We consider kernel estimation of marginal densities and regression
functions of stationary processes. It is shown that for a wide
class of time series, with proper centering and scaling, the maximum 
deviations of kernel density and regression estimates are asymptoti- 
cally Gumbel. Our results substantially generalize earlier ones which 
were obtained under independence or beta mixing assumptions. The 
asymptotic results can be applied to assess patterns of marginal den- 
sities or regression functions via the construction of simultaneous 
confidence bands for which one can perform goodness-of-fit tests. As 
an application, we construct simultaneous confidence bands for drift 
and volatility functions in a dynamic short-term rate model for the 
U.S. Treasury yield curve rates data. 

1. Introduction. Consider the nonparametric time series regression model 

(1.1) $Y_i = \mu(X_i) + \sigma(X_i)\eta_i$,

where $\mu(\cdot)$ [resp., $\sigma^2(\cdot)$] is an unknown regression (resp., conditional variance) function to be estimated, $(X_i, Y_i)$ is a stationary process and $\eta_i$ are unobserved independent and identically distributed (i.i.d.) errors with $E\eta_i = 0$ and $E\eta_i^2 = 1$. Let the regressor $X_i$ be a stationary causal process

(1.2) $X_i = G(\ldots, \varepsilon_{i-1}, \varepsilon_i)$,

where $\varepsilon_i$ are i.i.d. and the function $G$ is such that $X_i$ exists. Assume that $\eta_i$ is independent of $(\ldots, \varepsilon_{i-1}, \varepsilon_i)$. Hence, $\eta_i$ and $(\mu(X_i), \sigma(X_i))$ are independent.
As a special case of (1.1), a particularly interesting example is the nonlinear 
autoregressive model 

(1.3) $Y_i = \mu(Y_{i-1}) + \sigma(Y_{i-1})\eta_i$,

where $X_i = Y_{i-1}$ and $\varepsilon_i = \eta_{i-1}$. Many nonlinear time series models are of form (1.3) with different choices of $\mu(\cdot)$ and $\sigma(\cdot)$. If the form of $\mu(\cdot)$ is not known, we can use the Nadaraya-Watson estimator

(1.4) $\mu_n(x) = \frac{1}{nb f_n(x)} \sum_{k=1}^n K\left(\frac{X_k - x}{b}\right) Y_k$,

where $K$ is a kernel function with $K(\cdot) \ge 0$ and $\int_R K(u)\,du = 1$, the bandwidths $b = b_n \to 0$ and $nb_n \to \infty$, and

$f_n(x) = \frac{1}{nb} \sum_{k=1}^n K\left(\frac{X_k - x}{b}\right)$

is the kernel density estimate of $f$, the marginal density of $X_i$. Asymptotic properties of nonparametric estimates for time series have been widely discussed under various strong mixing conditions; see Robinson (1983), Györfi et al. (1989), Tjøstheim (1994), Bosq (1996), Doukhan and Louhichi (1999) and Fan and Yao (2003), among others.
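
As an illustration of the two estimators just defined, the following is a minimal sketch, assuming NumPy and an Epanechnikov kernel; the function names `epanechnikov`, `kernel_density` and `nadaraya_watson` are hypothetical and not part of the paper.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kernel_density(x_grid, X, b, kernel=epanechnikov):
    """f_n(x) = (nb)^{-1} sum_k K((X_k - x)/b), evaluated on x_grid."""
    x_grid = np.atleast_1d(np.asarray(x_grid, dtype=float))
    X = np.asarray(X, dtype=float)
    W = kernel((X[None, :] - x_grid[:, None]) / b)   # shape (grid, n)
    return W.sum(axis=1) / (len(X) * b)

def nadaraya_watson(x_grid, X, Y, b, kernel=epanechnikov):
    """mu_n(x) = sum_k K((X_k - x)/b) Y_k / sum_k K((X_k - x)/b)."""
    x_grid = np.atleast_1d(np.asarray(x_grid, dtype=float))
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    W = kernel((X[None, :] - x_grid[:, None]) / b)
    denom = W.sum(axis=1)
    num = W @ Y
    # Return NaN where no observation falls inside the kernel window.
    return np.divide(num, denom, out=np.full_like(num, np.nan), where=denom > 0)
```

The ratio form used in `nadaraya_watson` equals (1.4) because the factor $nb$ cancels between the numerator and $f_n(x)$.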

Under appropriate dependence conditions [see, e.g., Robinson (1983), Wu 
and Mielniczuk (2002), Fan and Yao (2003) and Wu (2005)], we have the 
central limit theorem 

$\sqrt{nb}\,[f_n(x) - E f_n(x)] \Rightarrow N(0, \lambda_K f(x))$ where $\lambda_K = \int_R K^2(u)\,du$.

The above result can be used to construct point-wise confidence intervals 
of f(x) at a fixed x. To assess shapes of density functions so that one can 
perform goodness-of-fit tests, however, one needs to construct uniform or 
simultaneous confidence bands (SCB). To this end, we need to deal with the 
maximum absolute deviation over some interval [l,u]: 

(1.5) $\Delta_n := \sup_{l \le x \le u} \frac{\sqrt{nb}}{\sqrt{\lambda_K f(x)}}\, |f_n(x) - E f_n(x)|$.

In an influential paper, Bickel and Rosenblatt (1973) obtained an asymptotic distributional theory for $\Delta_n$ under the assumption that $X_i$ are i.i.d. It is a very challenging problem to generalize their result to stationary processes where dependence is the rule rather than the exception. In their paper Bickel and Rosenblatt applied the very deep embedding theorem of approximating empirical processes of independent random variables by Brownian bridges with a reasonably sharp rate [Brillinger (1969), Komlós, Major and Tusnády (1975, 1976)]. For stationary processes, however, such an approximation with similar rates can be extremely difficult to obtain. Doukhan and Portal (1987) obtained a weak invariance principle for empirical distribution functions. Neumann (1998) made a breakthrough and proved a very useful result for $\beta$-mixing processes whose mixing rates decay exponentially quickly. Such processes are very weakly dependent. For mildly weakly dependent processes, the asymptotic problem of $\Delta_n$ remains open.
Fan and Yao [(2003), page 208] conjectured that similar results hold for 
stationary processes under certain mixing conditions. Here we shall solve 
this open problem and establish an asymptotic theory for both short- and 
long-range dependent processes. It is shown that, for a wide class of short- 
range dependent processes, we can have a similar asymptotic distributional 
theory as Bickel and Rosenblatt (1973). However, for long-range dependent 
processes, the asymptotic behavior can be sharply different. One observes 
the dichotomy phenomenon: the asymptotic properties depend on the inter- 
play between the strength of dependence and the size of bandwidths. For 
small bandwidths, the limiting distribution is the same as the one under 
independence. If the bandwidths are large, then the limiting distribution is 
half-normal [cf. (2.9)]. 

A closely related problem is to study the asymptotic uniform distributional theory for the Nadaraya-Watson estimator $\mu_n(x)$. Namely, one needs to find the asymptotic distribution of $\sup_{x \in T} |\mu_n(x) - \mu(x)|$, where $T = [l,u]$. With the latter result, one can construct an asymptotic $(1 - \alpha)$ SCB, $0 < \alpha < 1$, by finding two functions $\mu_n^{\mathrm{lower}}(x)$ and $\mu_n^{\mathrm{upper}}(x)$, such that

(1.6) $\lim_{n \to \infty} P(\mu_n^{\mathrm{lower}}(x) \le \mu(x) \le \mu_n^{\mathrm{upper}}(x) \text{ for all } x \in T) = 1 - \alpha$.

The SCB can be used for model validation: one can test whether $\mu(\cdot)$ is of certain parametric functional form by checking whether the fitted parametric form lies in the SCB. Following the work of Bickel and Rosenblatt (1973), Johnston (1982) derived the asymptotic distribution of $\sup_{0 \le x \le 1} |\mu_n(x) - E[\mu_n(x)]|$, assuming that $(X_i, Y_i)$ are independent random samples from a bivariate population. Johnston's derivation is no longer valid if dependence is present. For other work on regression confidence bands under independence see Knafl, Sacks and Ylvisaker (1985), Hall and Titterington (1988), Härdle and Marron (1991), Sun and Loader (1994), Xia (1998), Cummins, Filloon and Nychka (2001) and Dümbgen (2003), among others. Recently Zhao and Wu (2008) proposed a method for constructing SCB for stochastic regression models which have asymptotically correct coverage probabilities. However, their confidence band is over an increasingly dense grid of points instead of over an interval [see also Bühlmann (1998) and Knafl, Sacks and Ylvisaker (1985)]. Here we shall also solve the latter problem and establish a uniform asymptotic theory for the regression estimate $\mu_n(x)$, so that one can construct a genuine SCB for regression functions. A similar result will be derived for $\sigma(\cdot)$ as well.

The rest of the paper is organized as follows. Main results are presented 
in Section 2. Proofs are given in Sections 4 and 5. Our results are applied 
in Section 3 to the U.S. Treasury yield rates data. 
2. Main results. Before stating our theorems, we first introduce dependence measures. Assume $X_k \in \mathcal{L}^p$, $p > 0$. Here for a random variable $W$, we write $W \in \mathcal{L}^p$ ($p > 0$), if $\|W\|_p := (E|W|^p)^{1/p} < \infty$. Let $\{\varepsilon'_j\}_{j \in Z}$ be an i.i.d. copy of $\{\varepsilon_j\}_{j \in Z}$; let $\xi_n = (\ldots, \varepsilon_{n-1}, \varepsilon_n)$ and

$X'_n = G(\xi'_n)$ where $\xi'_n = (\xi_{-1}, \varepsilon'_0, \varepsilon_1, \ldots, \varepsilon_n)$.

Here $X'_n$ is a coupled process of $X_n$ with $\varepsilon_0$ in the latter replaced by an i.i.d. copy $\varepsilon'_0$. Following Wu (2005), define the physical dependence measure

$\theta_{n,p} = \|X_n - X'_n\|_p$.

Let $\theta_{n,p} = 0$ if $n < 0$. A similar quantity can be defined if we couple the whole past: let $\xi^*_{k,n} = (\ldots, \varepsilon'_{k-n-2}, \varepsilon'_{k-n-1}, \xi_{k-n,k})$, $k \ge n$, where $\xi_{i,j} = (\varepsilon_i, \varepsilon_{i+1}, \ldots, \varepsilon_j)$, and define

(2.1) $\Psi_{n,p} = \|G(\xi_n) - G(\xi^*_{n,n})\|_p$.

Our conditions on dependence will be expressed in terms of $\theta_{n,p}$ and $\Psi_{n,p}$.
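
To make the coupling construction concrete, here is a small Monte Carlo sketch for the physical dependence measure with $p = 2$. It assumes a Gaussian AR(1) process $X_n = \sum_{j \ge 0} a^j \varepsilon_{n-j}$, which is our illustrative choice and not an example from the paper; the function name and the truncation parameter `burn` are hypothetical. For this model $X_n - X'_n = a^n(\varepsilon_0 - \varepsilon'_0)$, so the exact value is $|a|^n \sqrt{2}$.

```python
import numpy as np

def theta_ar1_mc(a, n, reps=20000, burn=100, seed=0):
    """Monte Carlo estimate of theta_{n,2} = ||X_n - X_n'||_2 for an AR(1).

    X_n is approximated by truncating the infinite past at time -burn.
    X_n' uses the same innovations except that eps_0 is replaced by an
    independent copy.
    """
    rng = np.random.default_rng(seed)
    # Columns hold eps_{-burn}, ..., eps_0, ..., eps_n.
    eps = rng.standard_normal((reps, burn + n + 1))
    eps_coupled = eps.copy()
    eps_coupled[:, burn] = rng.standard_normal(reps)   # replace eps_0 only
    # Coefficient of eps_j in X_n is a^(n - j).
    weights = a ** np.arange(burn + n, -1, -1)
    x = eps @ weights
    x_coupled = eps_coupled @ weights
    return np.sqrt(np.mean((x - x_coupled) ** 2))
```

The geometric decay of the estimate in $n$ is the kind of behaviour required by a condition such as $\theta_{n,p} = O(\rho^n)$.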

2.1. Kernel density estimates. We first consider a special case of (1.2) in which $X_n$ has the form

(2.2) $X_n = a_0 \varepsilon_n + g(\ldots, \varepsilon_{n-2}, \varepsilon_{n-1}) = a_0 \varepsilon_n + g(\xi_{n-1})$,

where $g$ is a measurable function and $a_0 \ne 0$. Then the coupled process $X'_n = a_0 \varepsilon_n + g(\xi_{-1}, \varepsilon'_0, \varepsilon_1, \ldots, \varepsilon_{n-1})$. We need the following conditions:

(C1). There exist $0 < \delta_2 \le \delta_1 < 1$ such that $n^{-\delta_1} = O(b_n)$ and $b_n = O(n^{-\delta_2})$.

(C2). Suppose that $X_1 \in \mathcal{L}^p$ for some $p > 0$. Let $p' = \min(p, 2)$ and $\Theta_n = \sum_{i=0}^n \theta_{i,p'}^{p'/2}$. Assume $\Psi_{n,p'} = O(n^{-\gamma})$ for some $\gamma > \delta_1/(1 - \delta_1)$ and

(2.3) $Z_n b n^{-1} = o(\log n)$ where $Z_n = \sum_{k=-n}^{\infty} (\Theta_{n+k} - \Theta_k)^2$.

(C3). The density function $f_\varepsilon$ of $\varepsilon_1$ is positive and
$\sup_{x \in R} [f_\varepsilon(x) + |f'_\varepsilon(x)| + |f''_\varepsilon(x)|] < \infty$.

(C4). The support of $K$ is $[-A, A]$, where $K$ is differentiable over $(-A, A)$, the right (resp., left) derivative $K'(-A)$ [resp., $K'(A)$] exists, and $\sup_{|x| \le A} |K'(x)| < \infty$. The Lebesgue measure of the set $\{x \in [-A, A] : K(x) = 0\}$ is zero. Let $\lambda_K = \int K^2(y)\,dy$, $K_1 = [K^2(-A) + K^2(A)]/(2\lambda_K)$ and $K_2 = \int_{-A}^{A} (K'(t))^2\,dt/(2\lambda_K)$.
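Theorem 2.1. Let $l, u \in R$ be fixed and $X_n$ be of form (2.2). Assume (C1)-(C4). Then we have for every $z \in R$,

(2.4) $P((2\log \bar{b}^{-1})^{1/2}(\Delta_n - d_n) \le z) \to e^{-2e^{-z}}$,

where $\bar{b} = b/(u - l)$,

$d_n = (2\log \bar{b}^{-1})^{1/2} + \frac{1}{(2\log \bar{b}^{-1})^{1/2}} \left\{ \log\frac{K_1}{\pi^{1/2}} + \frac{1}{2}\log\log \bar{b}^{-1} \right\}$

if $K_1 > 0$, and otherwise

$d_n = (2\log \bar{b}^{-1})^{1/2} + \frac{1}{(2\log \bar{b}^{-1})^{1/2}} \log\frac{K_2^{1/2}}{2^{1/2}\pi}$.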

We now discuss conditions (C1)-(C4). The bandwidth condition (C1) is fairly mild. In (C2), the quantity $\Theta_n$ measures the cumulative dependence of $X_0, \ldots, X_n$ on $\varepsilon_0$, and, with (C1), it gives sufficient dependence and bandwidth conditions for the asymptotic Gumbel convergence (2.4). For the short-range dependent linear process $X_n = \sum_{j=0}^{\infty} a_j \varepsilon_{n-j}$ with $E\varepsilon_j = 0$ and $E\varepsilon_j^2 = 1$, (C2) is satisfied if $\sum_{j=0}^{\infty} |a_j| < \infty$ and $\sum_{j=n}^{\infty} a_j^2 = O(n^{-\gamma})$ for some $\gamma > 2\delta_1/(1 - \delta_1)$. The latter condition can be weaker than $\sum_{j=0}^{\infty} |a_j| < \infty$ if $\delta_1 < 1/3$. Interestingly, (C2) also holds for some long-range dependent processes; see Theorem 2.3. With (C3), it is easily seen that $X_i$ does have a density. If (C3) is violated, then $X_i$ may not have a density. For example, if $\varepsilon_i$ are i.i.d. Bernoulli with $P(\varepsilon_i = 0) = P(\varepsilon_i = 1) = 1/2$, then $X_i = \sum_{j=0}^{\infty} \rho^j \varepsilon_{i-j}$, where $\rho = (\sqrt{5} - 1)/2$, does not have a density [Erdős (1939)]. The kernel condition (C4) is quite mild and it is satisfied by many popular kernels. For example, it holds for the Epanechnikov kernel $K(u) = 0.75(1 - u^2)\mathbf{1}_{|u| \le 1}$.
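
The kernel constants $\lambda_K$, $K_1$ and $K_2$ of condition (C4) are easy to evaluate numerically. The sketch below assumes NumPy and uses Gauss-Legendre quadrature, which is exact for polynomial kernels; the function names are hypothetical. For the Epanechnikov kernel it gives $\lambda_K = 3/5$, $K_1 = 0$ and $K_2 = 5/4$, so the case $K_1 = 0$ applies.

```python
import numpy as np

def kernel_constants(K, dK, A=1.0, order=40):
    """Return (lambda_K, K_1, K_2) for a kernel K supported on [-A, A].

    K and dK are callables for the kernel and its derivative on (-A, A).
    Integrals are computed by Gauss-Legendre quadrature on [-A, A].
    """
    nodes, weights = np.polynomial.legendre.leggauss(order)
    t = A * nodes          # quadrature nodes rescaled to [-A, A]
    w = A * weights
    lam = np.sum(w * K(t) ** 2)
    K1 = (K(-A) ** 2 + K(A) ** 2) / (2.0 * lam)
    K2 = np.sum(w * dK(t) ** 2) / (2.0 * lam)
    return lam, K1, K2

def epan(u):
    """Epanechnikov kernel restricted to its support [-1, 1]."""
    return 0.75 * (1.0 - np.asarray(u, dtype=float) ** 2)

def d_epan(u):
    """Derivative of the Epanechnikov kernel on (-1, 1)."""
    return -1.5 * np.asarray(u, dtype=float)
```

A call such as `kernel_constants(epan, d_epan)` returns the three constants for the Epanechnikov kernel.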

In Theorem 2.2 below, we do not assume the special form (2.2). We need regularity conditions on conditional density functions. For jointly distributed random vectors $\xi$ and $\eta$, let $F_{\eta|\xi}(\cdot)$ be the conditional distribution function of $\eta$ given $\xi$; let $f_{\eta|\xi}(x) = \partial F_{\eta|\xi}(x)/\partial x$ be the conditional density. For a function $g$ with $E|g(\eta)| < \infty$, let $E(g(\eta)|\xi) = \int g(x)\,dF_{\eta|\xi}(x)$ be the conditional expectation of $g(\eta)$ given $\xi$.

Conditions (C2) and (C3) are replaced, respectively, by:

(C2)'. Suppose that $X_1 \in \mathcal{L}^p$ and $\theta_{n,p} = O(\rho^n)$ for some $p > 0$ and $0 < \rho < 1$.

(C3)'. The density function $f$ is positive and there exists a constant $B < \infty$ such that

$\sup_x [|f_{X_n|\xi_{n-1}}(x)| + |f'_{X_n|\xi_{n-1}}(x)| + |f''_{X_n|\xi_{n-1}}(x)|] \le B$ almost surely.
Theorem 2.2. Under (C1), (C2)', (C3)' and (C4), we have (2.4).

Many nonlinear time series models (e.g., ARCH models, bilinear models, exponential AR models) satisfy (C2)'; see Shao and Wu (2007). If $(X_i)$ is a Markov chain of the form $X_i = R(X_{i-1}, \varepsilon_i)$, where $R(\cdot,\cdot)$ is a bivariate measurable function, then $f_{X_i|\xi_{i-1}}(\cdot)$ is the conditional density of $X_i$ given $X_{i-1}$. Consider the ARCH model $X_i = \varepsilon_i (a^2 + b^2 X_{i-1}^2)^{1/2}$, where $a > 0$, $b > 0$ are real parameters and $\varepsilon_i$ has density function $f_\varepsilon$. Then $f_{X_i|X_{i-1}}(x) = f_\varepsilon(x/H_i)/H_i$, where $H_i = (a^2 + b^2 X_{i-1}^2)^{1/2}$. So (C3)' holds if $\sup_x [f_\varepsilon(x) + |f'_\varepsilon(x)| + |f''_\varepsilon(x)|] < \infty$ [cf. (C3)]. For more general ARCH-type processes see Doukhan, Madre and Rosenbaum (2007).

For short-range dependent processes for which

(2.5) $\Theta_\infty = \sum_{i=0}^{\infty} \theta_{i,p'}^{p'/2} < \infty$,

we have $Z_n = O(n)$ and (2.3) of condition (C2) trivially holds. For long-range dependent processes, (2.5) can be violated. A popular model for long-range dependence is the fractionally integrated autoregressive moving average process [Granger and Joyeux (1980), Hosking (1981)]. Here we consider the more general form of linear processes with slowly decaying coefficients:

(2.6) $X_n = \sum_{j=0}^{\infty} a_j \varepsilon_{n-j}$ where $a_j = j^{-\beta} \ell(j)$, $1/2 < \beta < 1$.

Here $a_0 = 1$, $\ell(\cdot)$ is a slowly varying function and $\varepsilon_j$ are i.i.d. with $E\varepsilon_j = 0$ and $E\varepsilon_j^2 = 1$.

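Theorem 2.3. Assume (2.6). Let $l, u \in R$ be fixed. (i) Assume (C1), (C3), (C4), $\delta_1/(1 - \delta_1) < \beta - 1/2$ and

(2.7) $b^{1/2} n^{1-\beta} \ell(n) = o(\log^{-1/2} n)$.

Then (2.4) holds. (ii) Assume (C1), (C3), (C4), $\sup_x |f'''_\varepsilon(x)| < \infty$ and

(2.8) $\log^{1/2} n = o(b^{1/2} n^{1-\beta} \ell(n))$.

Let $c_\beta = \int_0^{\infty} (x + x^2)^{-\beta}\,dx / [(3 - 2\beta)(1 - \beta)]$. Then

(2.9) $\frac{\Delta_n}{b^{1/2} n^{1-\beta} \ell(n)} \Rightarrow |N(0,1)|\, \frac{\sqrt{c_\beta}}{\sqrt{\lambda_K}} \max_{l \le x \le u} \frac{|f'(x)|}{\sqrt{f(x)}}$.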

Theorem 2.3 reveals the interesting dichotomy phenomenon for the maximum deviation $\Delta_n$: if the bandwidth $b_n$ is small such that (2.7) holds, then the asymptotic distribution is the same as the one under short-range dependence. However, if $b_n$ is large, then both the normalizing constant and the asymptotic distribution change. Let $b_n = n^{-\delta} \ell_1(n)$, where $\ell_1$ is another slowly varying function. Simple algebra shows that, if $\max((1 + \delta)/(1 - \delta), 2 - \delta) < 2\beta$, then the bandwidth condition in Theorem 2.3(i) holds. The latter inequality requires $\beta > \sqrt{3}/2 = 0.866025\ldots$. If $\beta < 1 - \delta/2$, then (2.8) holds. Theorem 2.3(ii) is similar to Theorem 3.1 in Ho and Hsing (1996), with our result having a wider range of $\beta$.
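
The "simple algebra" can be spelled out as follows. This is a sketch that ignores slowly varying factors and takes $\delta_1 = \delta$ in (C1); both simplifications are our reading of the conditions and not statements from the source.

```latex
\begin{align*}
\frac{\delta}{1-\delta} < \beta - \tfrac12
  &\iff \frac{1+\delta}{1-\delta} < 2\beta, \\
b_n^{1/2} n^{1-\beta} \approx n^{(2-\delta-2\beta)/2} \to 0
  &\iff 2-\delta < 2\beta
  \qquad\text{[direction of (2.7)]}, \\
b_n^{1/2} n^{1-\beta} \approx n^{(2-\delta-2\beta)/2} \to \infty
  &\iff \beta < 1-\delta/2
  \qquad\text{[direction of (2.8)]}, \\
\frac{1+\delta}{1-\delta} = 2-\delta
  &\iff \delta^2 - 4\delta + 1 = 0
  \iff \delta = 2-\sqrt{3}, \\
\min_{0<\delta<1}\,\max\Bigl(\frac{1+\delta}{1-\delta},\; 2-\delta\Bigr)
  &= 2-(2-\sqrt{3}) = \sqrt{3}
  \quad\Longrightarrow\quad \beta > \sqrt{3}/2 .
\end{align*}
```

The minimum is attained at the crossing point because $(1+\delta)/(1-\delta)$ is increasing and $2-\delta$ is decreasing in $\delta$.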

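2.2. Estimation of $\mu(\cdot)$ and $\sigma^2(\cdot)$. Let $\zeta_i = (\ldots, \eta_{i-1}, \eta_i, \xi_i)$. For a function $h$ with $E h^2(\eta_i) < \infty$, write

$M_n(x) = \frac{1}{nb} \sum_{k=1}^n K\left(\frac{X_k - x}{b}\right) Z_k$ where $Z_k = h(\eta_k) - E h(\eta_k)$.

Proposition 2.1. Let $l, u \in R$ be fixed. Assume $\sigma^2 = E Z_1^2$ and $E|Z_1|^p < \infty$, $p > 2/(1 - \delta_1)$. (i) Assume (2.2), (C1), (C3)-(C4) and $\Psi_{n,q} = O(n^{-\gamma})$ for some $q > 0$ and $\gamma > \delta_1/(1 - \delta_1)$. Then for all $z \in R$,

(2.10) $P\left( \sqrt{\frac{nb}{\lambda_K \sigma^2}} \sup_{l \le x \le u} \frac{|M_n(x)|}{\sqrt{f(x)}} - d_n \le \frac{z}{(2\log \bar{b}^{-1})^{1/2}} \right) \to e^{-2e^{-z}}$

as $n \to \infty$. (ii) Assume (1.2), (C1), (C2)', (C3)' and (C4) hold with $\xi_{n-1}$ in (C3)' replaced by $\zeta_{n-1}$. Then (2.10) holds.

Proposition 2.1(i) allows for long-range dependent processes. For (2.6), by Karamata's theorem, $\Psi_{n,2} = O(n^{1/2-\beta} \ell(n))$. So we have $\Psi_{n,2} = O(n^{-\gamma})$ with $\gamma > \delta_1/(1 - \delta_1)$ if $\delta_1 < (2\beta - 1)/(2\beta + 1)$.

For $S \subset R$, denote by $C^p(S) = \{g(\cdot) : \sup_{x \in S} |g^{(k)}(x)| < \infty, k = 0, \ldots, p\}$ the set of functions having bounded derivatives on $S$ up to order $p \ge 1$. Let $S^\epsilon = \bigcup_{y \in S} \{x : |x - y| \le \epsilon\}$ be the $\epsilon$-neighborhood of $S$, $\epsilon > 0$.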

Theorem 2.4. Let $l, u \in R$ be fixed and $K$ be symmetric. Assume that the conditions in Proposition 2.1 hold with $Z_n = \eta_n$, $f_\varepsilon(\cdot), \mu(\cdot) \in C^4(T^\epsilon)$ for some $\epsilon > 0$, where $T = [l,u]$, and that $b$ satisfies

$0 < \delta_1 < 1/3$, $nb^9 \log n = o(1)$ and $Z_n b^3 = o(n \log n)$.

Let $\psi_K = \int u^2 K(u)\,du/2$ and $\rho_\mu(x) = \mu''(x) + 2\mu'(x) f'(x)/f(x)$. Then

(2.11) $P\left( \sqrt{\frac{nb}{\lambda_K}} \sup_{l \le x \le u} \frac{\sqrt{f_n(x)}\, |\mu_n(x) - \mu(x) - b^2 \psi_K \rho_\mu(x)|}{\sigma(x)} - d_n \le \frac{z}{(2\log \bar{b}^{-1})^{1/2}} \right) \to e^{-2e^{-z}}$.
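Note that $\sigma^2(x) = E[(Y_k - \mu(X_k))^2 | X_k = x]$. It is natural to use the Nadaraya-Watson method to estimate $\sigma^2(x)$ based on the residuals $\hat{e}_k = Y_k - \mu_n(X_k)$:

$\hat\sigma_n^2(x) = \frac{1}{nh f_{n1}(x)} \sum_{k=1}^n K\left(\frac{X_k - x}{h}\right) \hat{e}_k^2$,

where the bandwidths $h = h_n \to 0$ and $nh_n \to \infty$, and

$f_{n1}(x) = \frac{1}{nh} \sum_{k=1}^n K\left(\frac{X_k - x}{h}\right)$.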

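Theorem 2.5. Let $l, u \in R$ be fixed and $K$ be symmetric. Assume $\nu_\eta = E\eta_1^4 - 1 < \infty$. Further assume that the conditions in Proposition 2.1 hold with $Z_n = \eta_n^2 - 1$, $f(\cdot), \sigma(\cdot) \in C^4(T^\epsilon)$ for some $\epsilon > 0$, where $T = [l,u]$, and that $h \asymp b$ satisfies

$0 < \delta_1 < 1/4$, $nb^9 \log n = o(1)$

and

$Z_n b^3 = o(n \log n)$.

Let $\rho_\sigma(x) = 2\sigma'^2(x) + 2\sigma(x)\sigma''(x) + 4\sigma(x)\sigma'(x) f'(x)/f(x)$. Then

(2.12) $P\left( \sqrt{\frac{nh}{\lambda_K \nu_\eta}} \sup_{l \le x \le u} \frac{\sqrt{f_{n1}(x)}\, |\hat\sigma_n^2(x) - \sigma^2(x) - h^2 \psi_K \rho_\sigma(x)|}{\sigma^2(x)} - d_n \le \frac{z}{(2\log \bar{h}^{-1})^{1/2}} \right) \to e^{-2e^{-z}}$,

where $d_n$ is defined as in Theorem 2.1 by replacing $\bar{b}$ with $\bar{h} = h/(u - l)$.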

We now compare the SCBs constructed based on Theorem 1 in Zhao and Wu (2008) and Theorem 2.4. Assume $l = 0$ and $u = 1$. The former is over the grid points $T_n = \{2b_n j, j = 0, 1, \ldots, J_n\}$ with $J_n = \lceil 1/(2b_n) \rceil$, while the latter is a genuine SCB in the sense that it is over the whole interval $T = [0,1]$. Let $\hat\rho_\mu(\cdot)$ [resp., $\hat\sigma(\cdot)$] be a consistent estimate of $\rho_\mu(\cdot)$ [resp., $\sigma(\cdot)$] and $z_\alpha = -\log\log(1 - \alpha)^{-1/2}$, $0 < \alpha < 1$. By Theorem 2.4, we can construct the $1 - \alpha$ SCB for $\mu(x)$ over $x \in [0,1]$ as

(2.13) $\mu_n(x) - b^2 \psi_K \hat\rho_\mu(x) \pm l_1 \hat\sigma(x) \sqrt{\frac{\lambda_K}{nb f_n(x)}}$, where $l_1 = \frac{z_\alpha}{(2\log b^{-1})^{1/2}} + d_n$.
Similarly, using Theorem 1 in Zhao and Wu (2008), the $1 - \alpha$ confidence band for $\mu(x)$ over $x \in T_n$ is also of form (2.13) with $l_1$ replaced by

$l_2 = \frac{z_\alpha}{(2\log J_n)^{1/2}} + (2\log J_n)^{1/2} - \frac{\frac{1}{2}\log\log J_n + \log(2\sqrt{\pi})}{(2\log J_n)^{1/2}}$.

Elementary calculations show that, interestingly, $l_1$ and $l_2$ are quite close: $l_1 - l_2 = (\log\log b^{-1})/(2\log b^{-1})^{1/2}\,(1 + o(1))$ if $K_1 > 0$.
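
The cutoff $z_\alpha$ and the resulting band half-width can be computed directly. The sketch below uses only the Python standard library; the function names are hypothetical, the centering constant `d_n` is passed in by the caller, and the half-width formula assumes the normalization $\sqrt{nb/\lambda_K}\,\sqrt{f_n(x)}/\sigma(x)$ read off from Theorem 2.4.

```python
import math

def gumbel_cutoff(alpha):
    """z_alpha solving exp(-2 exp(-z)) = 1 - alpha."""
    return -math.log(math.log((1.0 - alpha) ** -0.5))

def scb_halfwidth(alpha, b, n, lam_K, d_n, sigma_hat, f_hat, l=0.0, u=1.0):
    """Half-width l_1 * sigma_hat * sqrt(lam_K / (n b f_hat)) at one point x.

    sigma_hat and f_hat are estimates of sigma(x) and f(x) at that point;
    d_n is the centering constant, supplied by the caller.
    """
    b_bar = b / (u - l)
    l1 = gumbel_cutoff(alpha) / math.sqrt(2.0 * math.log(1.0 / b_bar)) + d_n
    return l1 * sigma_hat * math.sqrt(lam_K / (n * b * f_hat))
```

For $\alpha = 0.05$ the cutoff is roughly 3.66, which illustrates why the simultaneous band is noticeably wider than a point-wise interval.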

3. Application to the treasury bill data. There is a huge literature on models for short-term interest rates. Let $R_t$ be the interest rate at time $t$. Assume that $R_t$ follows the diffusion model

(3.1) $dR_t = \mu(R_t)\,dt + \sigma(R_t)\,dB(t)$,

where $B$ is the standard Brownian motion, $\mu(\cdot)$ is the instantaneous return or drift function and $\sigma(\cdot)$ is the volatility function. Black and Scholes (1973) considered the model with $\mu(x) = \alpha x$ and $\sigma(x) = \sigma x$. Vasicek (1977) assumed that $\mu(x) = \alpha_0 + \alpha_1 x$ and $\sigma(x) = \sigma$, where $\alpha_0$, $\alpha_1$ and $\sigma$ are unknown constants. Cox, Ingersoll and Ross (1985) and Courtadon (1982) assumed that $\sigma(x) = \sigma x^{1/2}$ and $\sigma(x) = \sigma x$, respectively. Both models are generalized by Chan et al. (1992) to the form $\sigma(x) = \sigma x^{\gamma}$, with $\sigma$ and $\gamma$ being unknown parameters. Stanton (1997), Fan and Yao (1998), Chapman and Pearson (2000) and Fan and Zhang (2003) considered the nonparametric estimation of $\mu(\cdot)$ and $\sigma(\cdot)$ in (3.1); see also Aït-Sahalia (1996a, 1996b). Stanton (1997) constructed point-wise confidence intervals which serve as a tool for suggesting which parametric models to use. Zhao (2008) gave an excellent review of parametric and nonparametric approaches to (3.1). See also the latter paper for further references.

Here we shall consider the U.S. six-month treasury yield rates data from January 2nd, 1990 to July 31st, 2009. The data can be downloaded from the U.S. Treasury department's website http://www.ustreas.gov/. It has 4900 daily rates and a plot is given in Figure 1. Let $X_i = R_{t_i}$ be the rate at day $i = 1, \ldots, 4900$. For the daily data, since one year has 250 transaction days, $t_i - t_{i-1} = 1/250$. Let $\Delta = 1/250$. As a discretized version of (3.1), we consider the model

(3.2) $Y_i = \mu(X_i)\Delta + \sigma(X_i)\Delta^{1/2}\eta_i$,

where $Y_i = R_{t_{i+1}} - R_{t_i} = X_{i+1} - X_i$ and $\eta_i = (B(t_{i+1}) - B(t_i))/\Delta^{1/2}$ are i.i.d. standard normal. For convenience of applying Theorem 2.4, in the sequel we shall write $\mu(X_i)\Delta$ [resp., $\sigma(X_i)\Delta^{1/2}$] in (3.2) as $\mu(X_i)$ [resp., $\sigma(X_i)$]. So (3.2) is rewritten as

(3.3) $Y_i = \mu(X_i) + \sigma(X_i)\eta_i$.
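
The discretization (3.2) is an Euler scheme, and it can be simulated in a few lines. The sketch below assumes NumPy; the function name and the Vasicek-type drift used in the usage note are illustrative choices, not values fitted to the Treasury data.

```python
import numpy as np

def simulate_discretized_diffusion(mu, sigma, r0, n, delta=1.0 / 250, seed=0):
    """Simulate R_{t_0}, ..., R_{t_n} from the Euler scheme of dR = mu dt + sigma dB.

    Returns (X, Y, eta) with X_i = R_{t_i}, Y_i = X_{i+1} - X_i and the
    standard normal innovations eta_i, so that
    Y_i = mu(X_i) * delta + sigma(X_i) * sqrt(delta) * eta_i.
    """
    rng = np.random.default_rng(seed)
    eta = rng.standard_normal(n)
    R = np.empty(n + 1)
    R[0] = r0
    for i in range(n):
        R[i + 1] = R[i] + mu(R[i]) * delta + sigma(R[i]) * np.sqrt(delta) * eta[i]
    X = R[:-1]          # regressors X_i
    Y = np.diff(R)      # responses Y_i = X_{i+1} - X_i
    return X, Y, eta
```

For example, `simulate_discretized_diffusion(lambda x: 0.5 * (4.0 - x), lambda x: 0.3 + 0.0 * x, 4.0, 4900)` produces a series of the same length as the data set, with a mean-reverting drift and constant volatility.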
Figure 2 shows the estimated 95% simultaneous confidence band for the regression function $\mu(\cdot)$ over the interval $T = [l,u] = [0.35, 8.06]$, which includes 96% of the daily rates $X_i$. To select the bandwidth, we use the R program bw.nrd which gives $b = 0.37$. Then we use the R program locpoly for local polynomial regression. The Nadaraya-Watson estimate is a special case of the local polynomial regression with degree 0. The function $\rho(x)$ in the bias term $b^2 \psi_K \rho(x)$ in Theorem 2.4 involves the first and second order derivatives $\mu'$, $f'$ and $\mu''$. The program locpoly can also be used to estimate derivatives $\mu'$ and $\mu''$, where we use the bigger bandwidth $2b = 0.74$. For $f$, we use the R program density, and estimate $f'$ by differentiating the estimated density. Then we can have the bias-corrected estimate $\tilde\mu_n(x) = \mu_n(x) - b^2 \psi_K \hat\rho(x)$ for $\mu$, which is plotted as the middle curve in Figure 2. To estimate $\sigma(\cdot)$, as in Stanton (1997), we shall make use of the estimated residuals $\hat{e}_i = Y_i - \mu_n(X_i)$, and perform the Nadaraya-Watson regression of $\hat{e}_i^2$ versus $X_i$ with the bandwidth $b$. In our data analysis the boundary problem of the Nadaraya-Watson regression raised in Chapman and Pearson (2000) is not severe since we focus on the interval $T = [0.35, 8.06]$, while the whole range is $[\min X_i, \max X_i] = [0.14, 8.49]$.



Fig. 1. U.S. six-month treasury yield curve rates data from January 2nd, 1990 to July 31st, 2009. Source: U.S. Treasury department's website http://www.ustreas.gov/.



SIMULTANEOUS NONPARAMETRIC INFERENCE 



Fig. 2. 95% SCB of the regression function $\mu(\cdot)$ over the interval $[l,u] = [0.35, 8.06]$. The dashed curve in the middle is $\mu_n(x) - b^2 \psi_K \hat\rho(x)$, the bias-corrected estimate of $\mu$.



The Gumbel convergence in Theorem 2.4 can be quite slow, so the SCB in 
(2.13) may not have a good finite-sample performance. To circumvent this 
problem, we shall adopt a simulation based method. Let 

n _ \U =1 K(xt/b- x /bn\ 

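where $X_t^\diamond$ are i.i.d. with density $f$, $\eta_t^\diamond$ are i.i.d. with $E\eta_t^\diamond = 0$, $E(\eta_t^\diamond)^2 = 1$ and $E|\eta_1^\diamond|^p < \infty$, and $(X_t^\diamond)$ and $(\eta_t^\diamond)$ are independent. As in Theorem 2.4, let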



Tjt yjj{x)\^n{x) ~ ^{x) -b i) K p{x)\ 

n n = sup . 

By Theorem 2.4 and Proposition 2.1, with proper centering and scaling, $\Pi_n$ and $\Pi_n^\diamond$ have the same asymptotic Gumbel distribution. So the cutoff value, the $(1 - \alpha)$th quantile of $\Pi_n$, can be estimated by the sample $(1 - \alpha)$th quantile of many simulated $\Pi_n^\diamond$'s. For the U.S. Treasury bill data, we simulated 10,000 $\Pi_n^\diamond$'s and obtained the 95% sample quantile 0.39. Then the SCB is constructed as $\tilde\mu_n(x) \pm 0.39\,\hat\sigma(x)/f_n^{1/2}(x)$; see the upper and lower curves in Figure 2.
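
The simulation-based cutoff can be sketched as follows. This is only an illustration under stated assumptions: the statistic is taken to be $\sup_x \sqrt{f_n^\diamond(x)}\,|\mu_n^\diamond(x)|$ computed from an i.i.d. sample with standard normal errors and zero regression function, which is one plausible reading of the construction above; the function names, the Epanechnikov kernel and the user-supplied sampler `sample_x` are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel on [-1, 1]."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def simulated_cutoff(sample_x, n, b, grid, alpha=0.05, reps=200, seed=0):
    """Empirical (1 - alpha) quantile of sup_x sqrt(f_n(x)) |mu_n(x)|.

    sample_x(rng, n) must return n i.i.d. draws from the (estimated)
    marginal density; errors are i.i.d. N(0, 1) and independent of X.
    The supremum is approximated by a maximum over `grid`.
    """
    rng = np.random.default_rng(seed)
    grid = np.asarray(grid, dtype=float)
    stats = np.empty(reps)
    for r in range(reps):
        X = np.asarray(sample_x(rng, n), dtype=float)
        eta = rng.standard_normal(n)
        W = epanechnikov((X[None, :] - grid[:, None]) / b)
        s = W.sum(axis=1)
        f_n = s / (n * b)                                   # density estimate
        mu_n = np.divide(W @ eta, s, out=np.zeros_like(s), where=s > 0)
        stats[r] = np.max(np.sqrt(f_n) * np.abs(mu_n))
    return np.quantile(stats, 1.0 - alpha)
```

The returned quantile plays the role of the constant multiplying $\hat\sigma(x)/f_n^{1/2}(x)$ in the band; its numerical value depends on $n$, $b$, the kernel and the sampling density, so it should not be expected to reproduce the value 0.39 reported for the data.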
We now apply Theorem 2.5 to construct SCB for $\sigma^2(\cdot)$. We choose $h = b$, which has a reasonably satisfactory performance in our data analysis. By Theorem 2.5,

$\Pi'_n = \frac{1}{\sqrt{\nu_\eta}} \sup_{l \le x \le u} \frac{\sqrt{f_n(x)}\, |\hat\sigma_n^2(x) - \sigma^2(x) - b^2 \psi_K \rho_\sigma(x)|}{\sigma^2(x)}$

has the same asymptotic distribution as $\Pi_n$ and $\Pi_n^\diamond$. Based on the above simulation, we choose the cutoff value 0.39. As in the treatment of $\mu'$ and $\mu''$ in the bias term of $\mu_n$, we use a similar estimate, noting that $\rho_\sigma(x) = (\sigma^2(x))'' + 2(\sigma^2(x))' f'(x)/f(x)$ has the same form as $\rho_\mu(x)$. The 95% SCB of $\sigma^2(\cdot)$ is presented in Figure 3.

Based on the 95% SCB of $\mu(\cdot)$, we conclude that the linear drift function hypothesis $H_0 : \mu(x) = \alpha_0 + \alpha_1 x$ for some $\alpha_0$ and $\alpha_1$ is rejected at the 5% level. Other simple parametric forms do not seem to exist. Similar claims can be made for $\sigma^2(\cdot)$, and none of the parametric forms previously mentioned seems appropriate. This suggests that the dynamics of the treasury yield rates might be far more complicated than previously speculated.
4. Proofs of Theorems 2.1-2.3. Throughout the proofs $C$ denotes constants which do not depend on $n$ and $b_n$. The values of $C$ may vary from place to place. Let $\lfloor\cdot\rfloor$ and $\lceil\cdot\rceil$ be the floor and ceiling functions, respectively. Without loss of generality, we assume $l = 0$, $u = 1$ in (1.5) and $A = 1$ in condition (C4). Write

$\frac{\sqrt{nb}}{\sqrt{\lambda_K f(bt)}} [f_n(bt) - E f_n(bt)] = M_n(t) + N_n(t)$,

where $M_n(t)$ has summands of martingale differences

$M_n(t) = \frac{1}{\sqrt{nb\lambda_K f(bt)}} \sum_{k=1}^n \{K(X_k/b - t) - E[K(X_k/b - t)|\xi_{k-1}]\}$

and, since $E[K(X_k/b - t)|\xi_{k-1}] = b \int_{-1}^{1} K(v) f_{X_k|\xi_{k-1}}(bv + bt)\,dv$, the remainder

$N_n(t) = \frac{1}{\sqrt{nb\lambda_K f(bt)}} \sum_{k=1}^n \{E[K(X_k/b - t)|\xi_{k-1}] - E K(X_k/b - t)\} = \frac{\sqrt{b}}{\sqrt{n\lambda_K f(bt)}} \int_{-1}^{1} K(v) Q'_n(bv + bt)\,dv$,

where

$Q_n(x) = \sum_{k=1}^n \{F_{X_k|\xi_{k-1}}(x) - F(x)\}$.

If $X_n$ admits the form (2.2), we assume $a_0 = 1$. Let $Y_k = g(\ldots, \varepsilon_{k-1}, \varepsilon_k)$. Then $f_{X_k|\xi_{k-1}}(bv + bt) = f_\varepsilon(bv + bt - Y_{k-1})$.

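Proofs of Theorems 2.1 and 2.2. We split $[1,n]$ into alternating big and small blocks $H_1, I_1, \ldots, H_{\iota_n}, I_{\iota_n}, I_{\iota_n+1}$, with length $|H_i| = \lfloor n^{\tau_1} \rfloor$, $|I_i| = \lfloor n^{\tau} \rfloor$, $1 \le i \le \iota_n$, $|I_{\iota_n+1}| = n - \iota_n(\lfloor n^{\tau_1} \rfloor + \lfloor n^{\tau} \rfloor)$ and $\iota_n = \lfloor n/(\lfloor n^{\tau_1} \rfloor + \lfloor n^{\tau} \rfloor) \rfloor$, where $\delta_1/\gamma < \tau < \tau_1 < 1 - \delta_1$. Let $m = |I_1|$,

$u_j(t) = \sum_{k \in H_j} \{E[K(X_k/b - t)|\xi_{k-m,k}] - E[K(X_k/b - t)|\xi_{k-m,k-1}]\}$,

$v_j(t) = \sum_{k \in I_j} \{E[K(X_k/b - t)|\xi_{k-m,k}] - E[K(X_k/b - t)|\xi_{k-m,k-1}]\}$,

$\tilde{M}_n(t) = \frac{1}{\sqrt{nb\lambda_K f(bt)}} \sum_{j=1}^{\iota_n} u_j(t)$, $R_n(t) = \frac{1}{\sqrt{nb\lambda_K f(bt)}} \sum_{j=1}^{\iota_n+1} v_j(t)$.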
yJnb\ K f{bt) ^ yJnb\ K f(bt) 
Theorems 2.1 and 2.2 follow from Lemmas 4.1-4.3 and Lemma 4.5 below. □

Proof of Theorem 2.3. Case (i) follows from Theorem 2.1. For (ii), since $\sum_{i=1}^n Y_{i-1}/(\sqrt{c_\beta}\, n^{3/2-\beta} \ell(n)) \Rightarrow N(0,1)$ [cf. Ho and Hsing (1996)], where $Y_{i-1} = \sum_{k=1}^{\infty} a_k \varepsilon_{i-k}$, it follows from (2.8), Lemma 4.1(ii) and Lemma 4.4. □

Lemma 4.1. Assume (C4). (i) We have

(4.1) $\sup_{0 \le t \le b^{-1}} |N_n(t)| = O_P(b^{1/2} n^{-1/2} \Theta_n)$,

where $\Theta_n = Z_n^{1/2}$ if $(X_n)$ satisfies (2.2) and (C3); $\Theta_n = O(n^{1/2})$ if $(X_n)$ satisfies (1.2), (C2)' and (C3)'. (ii) For the process (2.6), we have (4.1) with $\Theta_n = O(n^{3/2-\beta} \ell(n))$, and

(4.2) $\sup_{0 \le t \le b^{-1}} \left| N_n(t)\sqrt{nb\lambda_K f(bt)} - b f'(bt) \sum_{j=1}^n Y_{j-1} \right| = o_P(b n^{3/2-\beta} \ell(n))$,

where $Y_{j-1} = \sum_{k=1}^{\infty} a_k \varepsilon_{j-k}$.

Lemma 4.2. Under conditions of Theorems 2.1 or 2.2, we have

$P\left( \sup_{0 \le t \le b^{-1}} |M_n(t) - \tilde{M}_n(t) - R_n(t)| \ge (\log b^{-1})^{-2} \right) = o(1)$.

Lemma 4.3. Under conditions of Theorems 2.1 or 2.2, we have

(4.3) $P\left( \sup_{0 \le t \le b^{-1}} |R_n(t)| \ge (\log b^{-1})^{-2} \right) = o(1)$.

Lemma 4.4. Let $\sup_x f_{X_n|\xi_{n-1}}(x)$ be a.s. bounded. Assume (C4). Then

$\sup_{0 \le t \le b^{-1}} |M_n(t)| = O_P(\sqrt{\log n})$.

Consequently, under conditions of Lemma 4.1, $E f_n(x) - f(x) = f''(x) b^2 \psi_K + o(b^2)$ and

$\sup_{0 \le x \le 1} |f_n(x) - f(x)| = \frac{O_P(\sqrt{\log n})}{\sqrt{nb_n}} + \frac{O_P(\Theta_n)}{n} + O(b^2)$.

Lemma 4.4 gives an upper bound of $\sup_{0 \le t \le b^{-1}} |M_n(t)|$. Under stronger conditions, one can have a far deeper asymptotic distributional result. By Lemmas 4.5, 4.2 and 4.3, it is asymptotically distributed as Gumbel.
Lemma 4.5. Under conditions of Theorems 2.1 or 2.2, we have for all $z \in R$ that

(4.4) $P\left( \sup_{0 \le t \le b^{-1}} |\tilde{M}_n(t)| < x_z \right) \to e^{-2e^{-z}}$ where $x_z = d_n + \frac{z}{(2\log b^{-1})^{1/2}}$.

4.1. Proofs of Lemmas 4.1-4.4.

Proof of Lemma 4.1. We claim that, for any $a_0 > 0$,

(4.5) $E\left[ \sup_{|x| \le a_0} |Q'_n(x)|^2 \right] = O(\Theta_n^2)$,

which implies Lemma 4.1(i) in view of

(4.6) $N_n(t) = \frac{\sqrt{b}}{\sqrt{n\lambda_K f(bt)}} \int_{-1}^{1} K(x) Q'_n(b(x + t))\,dx$

by noting that $\inf_{0 \le x \le 1} f(x) > 0$, $\int_{-1}^{1} |K(u)|\,du < \infty$. To prove (4.5), we use Lemma 4 in Wu (2003), which implies that
/ao rao 
\Q' n (x)\ 2 dx + 2a \Ql(x)\ 2 dx. 
-ao J —ao 

We first suppose that $(X_n)$ satisfies (2.2) and (C3). Let
$\mathcal{P}_k \cdot = E(\cdot|\mathcal{F}_k) - E(\cdot|\mathcal{F}_{k-1})$, $k \in Z$,
be the projection operators. By the orthogonality of $\mathcal{P}_k$, we have

n n / n \ 2 

\\QM\\l= £ W^Q'n(x)\\ 2 2< £ £ 11^16-1(^)113] 
fc=— oo k=—oo \i=l / 

n / n—k \ ^ 

<c E ( £ C =04, 

k=—oo \i=l—k / 

where $C$ does not depend on $x$. Similarly, we have $\sup_{x \in R} \|Q''_n(x)\|^2 \le C\Theta_n^2$. This proves (4.5).

To prove (4.5) for $(X_n)$ satisfying (1.2), (C2)' and (C3)', we note that
sup HPfci^-iWIl! < sn P E\I{Xi <x}- I{X ij{k} < x}\ 

<supP(\Xi-x\<\Xi-X i>{k} \) 

x£R 
where $X_{i,\{k\}} = G(\xi_{k-1}, \varepsilon'_k, \xi_{k+1,i})$ and we used the inequality

$|I\{X \le x\} - I\{Y \le x\}| \le I\{|X - x| \le |X - Y|\}$.
Since swp x \f' x ^ i (x)\ < B , we have 

Fx^W-Fx^x-A) 



< BA, 



which by letting A = + d tl, P ) 1/2 y' ields that 

sup 11^1^(^)111 < c{e]Ll P + 

xeR 

This implies sup^R = 0(n). Similarly, we have sup xGR ||<X(a;)||| = 

0(n). We finish the proof of Lemma 4. 1 (i) . 

We now prove (4.2). For i > 2 write y_i = U + a.j£o + W, where U = 

Y!jA a o £ i-3 and w = T.f= i+ i a i e i-j- Let w ' = Y,T=i+i a 3 e 'i-j- Let °o = 

su Pxi\fe( x )\ + l/e'WI]- By Taylor's expansion, there exists R G [0,1] such 
that 

di : = sup \\f E (x - y_i) - / 6 (x - C/ - + tHeMx -U- che' - W')\\ 

x 

= sup \\-aie fe(x - U - RaiS -W) + aie f' £ {x - U - a^o - W')\\ 

X 

< ||a i e coniin(l,|aie / l + k^ol + \W\ + |W|)|| = o(\ch\). 

Here we use the fact that ||eomin(l, |aj£o|)|| since a, — > 0, and a^o 
and |W| + \W'\ are independent. Since ej,e m , Z,m G Z, are i.i.d., we have 
f(x) = E[f £ (x — U — aj£g — W')|£o]- By the Lebesgue dominated convergence 
theorem, f'(x) = E[f £ (x — U — aie' — W')\£o]. By Jensen's inequality, 

sup||E[/ £ (x - y_0 - f £ (x -U- W)\£o] + ai e f'(x)\\ < 4 , 

X 

which again by Jensen's inequality implies that sup x ||E[/ e (x — y_i) — f e (x — 
U-W)\C-x] < 4 . Since E[f £ (x-U -W)^] = E[f £ (x-U- W)|£o], we have 

sup\\V [f £ (x - y_i) + /'(^y-ilH < 20i = o(\ai\). 

X 

Define = if i < 0. Let T n (x) = Q n (x) + /(a;) £? =1 y_i. If fc < -n, then 

rx(x)ii<^2v fe =Kn|fcr^(|fci)). 

If — n < k < n, by Karamata's theorem, ^^=1 a i = 0{na n ). Hence, 

n 2n 

su V \\v k T' n (x)\\ < E 2 Vfc < E 2 ^- = °(™ 1_ ^W)- 
Since $\mathcal{P}_k \cdot = E(\cdot|\xi_k) - E(\cdot|\xi_{k-1})$, $k \in Z$, are orthogonal,

/ — n n \ 

Hupll^H^sup E + E ll^(x)f = (n 3 - 2 ^ 2 (n)), 



^ = — 00 fc = l 



-n/ 



where we again applied Karamata's theorem implying X^m=n m (jn) 
0(n l ~ 2 ^£ 2 (n)). Similarly, since sup x \f"'(x)\ < oo, we have sup^ ||T"(x)|| 2 
ofaS-Wpfa)). Since T^x) = 3^(0) + JqT^u) du, for all finite a > 0, 



E 

Hence, (4.2) follows in view of (4.6). □ 



sup \T^(x)\ 2 = o(n 3 " 2 ^ 2 (n)). 
|a;|<ao 



Proof of Lemma 4.2. Let Z k>t = K(X k /b-t)-E[K(X k /b-t)\^ mjk ], 
%k.t = Zk,t ~ E( z kMk-i) and 

n 

[nb\ K f(bt)]y 2 [M n (t) - M n (t) - R n (t)]=Y,Zk,t- 

k=i 

We shall approximate Xwc=i %k,t by the skeleton process Ylk=i %k,tj > 1 < J < 
q n , where q n = [n 2 /b\ and tj = j/(bq n ). To this end, for t G [tj_i,ij], under 
condition (C4), if — i and — tj are both in or outside [—1, 1], we 
have 

\K(X k /b - t) - K(X k /b - tj)\ < C\t -tj\< CrT 2 . 

Otherwise, we have either \X k /b — tj — 1| < Cn~ 2 or \X k /b — tj + 1| < Cn~ 2 . 
Let 

n n 

Lj = ^2 hj, L* = E(hj\€k-i), 

k=l k=l 

(4.7) 

n n 

Hj = E(Ikj\£k-m,k) an d H* = ^2 E(/fcj|Cfc-m,fc-l)> 
k=l k=l 

where I kj = I{\b~ l X k - tj ± 1| < Cn~ 2 }. Then 



(4.8) sup 



tj-i<t<tj 



' j 

n 



fc=i 



< — + CLj + CL* + CHj + C#*. 



Since fx n \£ n -i( x ) 1S bounded, E(_?fej |^fc_i) < Cn 2 b. Hence, L* < Cn 1 b and 
D k j = hj ~ E(4j|^fc-i) satisfies E(Z)^.|4_i) < CrT 2 b. Let L<> = maxi<,< ?n L,-. 
Applying the inequality due to Freedman (1975) to $L_j - L_j^* = \sum_{k=1}^n D_{kj}$, we
have



(4.9) 



P(L > 9 logn) < Pf max \Lj - L*\ > 8 logn) + ?( max L* > logn) 

\l<7<<7n Vl<J<l2n 



a<j<<z, 
< 2q n exp 



(8 logn) 



o(n" 2 ). 



-2 x (8 logn) -2Cn~ 1 b_ 

Similarly, we have Hj < Cn b, and, for H<> = maxi<j< 9n Hj, P(-ff<> > 9 logn) 

o(n -2 ). Since logn = o(y/nb/ (log 6 -1 ) 2 ), by (4.8) and (4.9), it remains to 
show that 



(4.10) 



P max 

\ l<j<qr. 



k=l 



> 2~ 1 Vnb(logb~ 



l\-2 



o(l). 



We first consider the case of X n in (2.2). Recall (2.1) for £j£ n . Define 
K x>t (^ 1 ) = K( X + 9 f k - l) -t\ and = tf* |t (e*_i) - K x M-i, J- 

Let W fc = b(£ fc -i) -5(a-i,m)l- % condition (C2), ||W fe || p , = O(m^). By 
Lemma 4.8, we have J^^K^) 2 dx < Cbmm((Wk/b) a , 1). Hence, by Jensen's 
inequality, 



/oo 
(K x ,t(^ k -i) - E[Kx,t(£k-i)\fik-m,k-i}) 2 fe(x) dx 
-oo 



(4.11) 



< E 



" poo 

/ ( K x,t) 2 fs(x) dx £k-m,k-l 
_J —oo 

<CbE[min((W k /b) a ,l)\Ck- m ,k-i]- 
Let V = maxi<j< gn 2fc=i E (^fc,t., J£fc-l)- Since 61/7 < r < 1 -<5i and m ~ n r , 
n6 



(4.12) 



p y > 



(log?)- 1 ) 6 



< C(log&- 1 ) 6 Emin((W fe /6) a ,l) 



< C (log n 



, v& , \ min (p'>«) 

6 I m,p' 



0(1). 



By Freedman's (1975) inequality for martingale differences, we have 



P max 

\ !<?<?» 



fe=i 

< 2g n exp 



> 



nb 



2(log6- 1 ) 2 " " (log^ 1 ) 6 J 
nb(log fe" 1 ) -4 



CVnb(log 5" 1 )- 2 + Cnfr(log & 



o(l) 
by condition (C1). So (4.10) follows from (4.12).

The proof of (4.10) for $X_n$ in Theorem 2.2 is simpler. Let $p_1 = \min(p, 1)$ and $\rho_1 \in (\rho, 1)$. We have, by (C2)' and (C3)', that

supE|Z M | < CP(\X k - XU > pf) + Cb^p? 
teR 

+ Csup P(\X k -tb±b\< p?) < Cip/pf ) m + Cb^p?. 

t£R 

Hence, using Markov's inequality, (4.10) follows. □ 

Proof of Lemma 4.3. Let $A = (\log b^{-1})^{-3} = o((\log b^{-1})^{-2})$. Recall the
proof of Lemma 4.2 for $t_j$. From the proof of Lemma 4.2, we only need to
consider the behavior of $R_n(t)$ at grids $t_j$. Note that $\tau < \tau_1$ and

(4.13) sup E E i R2 (( X k ~ t)/bMk-i] < C{n l - T ^ +T + n^)b a.s. 
teR j=i keij 

By Freedman's inequality for martingale differences and (4.13),

$$
\mathsf{P}\Bigl(\max_{0 \le j \le q_n} |R_n(t_j)| > A\Bigr) \le 4q_n \exp\biggl[\frac{A^2 nb}{-2CA\sqrt{nb} - 2C(n^{1-\tau_1+\tau} + n^{\tau_1})b}\biggr] = o(1),
$$

since $n^{-\delta_1} = O(b)$. Hence, (4.3) follows. □

Proof of Lemma 4.4. From the proof of Lemma 4.2, we only need to
show that

$$
\sup_{0 \le j \le q_n} |M_n(t_j)| = O_{\mathsf{P}}(\sqrt{\log n}),
$$

which follows from $\sup_{t \in \mathbb{R}} \mathsf{E}[K^2((X_k - t)/b) \mid \xi_{k-1}] \le Cb$ a.s. and Freedman's
inequality for martingale differences. □
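
The proofs of Lemmas 4.2-4.4 repeatedly combine Freedman's inequality with a union bound over a grid of $q_n$ points. The following Python sketch evaluates a bound of the generic form $2q\exp\{-x^2/(2cx + 2y)\}$. The function name and the parameters `c` (an almost sure bound on the martingale differences) and `y` (a bound on the summed conditional variances) are illustrative assumptions rather than notation from the paper, and the numerical constants are arbitrary.

```python
import math


def freedman_union_bound(q, x, c, y):
    """Return min(1, 2*q*exp(-x^2 / (2*c*x + 2*y))).

    q: number of grid points in the union bound (hypothetical name)
    x: deviation threshold
    c: bound on the absolute martingale differences (c >= 0)
    y: bound on the sum of conditional variances (y >= 0, c and y not both 0)
    """
    if x <= 0:
        return 1.0
    exponent = -(x * x) / (2.0 * c * x + 2.0 * y)
    return min(1.0, 2.0 * q * math.exp(exponent))


# Hypothetical illustration: q grows polynomially in n, the threshold is
# lam * sqrt(log n), the differences are bounded by 1/sqrt(n*b) and the
# conditional variance sum is bounded by 1.
n, b = 10_000, 0.05
q = int(n ** 2 / b)
for lam in (2.0, 6.0, 10.0):
    x = lam * math.sqrt(math.log(n))
    print(lam, freedman_union_bound(q, x, 1.0 / math.sqrt(n * b), 1.0))
```

Under these assumed constants the bound is uninformative for small `lam` and becomes negligible once `lam` is large enough to beat the polynomial growth of `q`, which is the mechanism behind an $O_{\mathsf{P}}(\sqrt{\log n})$ rate.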

4.2. Proof of Lemma 4.5. As in Bickel and Rosenblatt (1973), we split
the interval $[0, b^{-1}]$ into alternating big and small intervals $W_1, V_1, \ldots, W_N, V_N$,
where $W_i = [a_i, a_i + w]$, $V_i = [a_i + w, a_{i+1}]$, $a_i = (i-1)(w+v)$, $a_{N+1} = b^{-1}$
and $N = \lfloor b^{-1}/(w+v) \rfloor$. We will let $v$ be sufficiently small and $w$ be fixed.
We shall first approximate $\Omega^+ := \sup_{0 \le t \le b^{-1}} M_n(t)$ by $\Psi^+ := \max_{1 \le k \le N} \Upsilon_k^+$,
where $\Upsilon_k^+ := \sup_{t \in W_k} M_n(t)$, and then approximate $\Upsilon_k^+$ via discretization by

$$
\Xi_k^+ := \max_{1 \le j \le \chi} M_n(a_k + jax^{-2/\alpha}) \quad \text{where } \chi = \lfloor wx^{2/\alpha}/a \rfloor,\ a > 0.
\tag{4.14}
$$

We similarly define $\Omega^-$, $\Psi^-$, $\Upsilon_k^-$ and $\Xi_k^-$ by replacing "sup" or "max" by
"inf" or "min," respectively. Let $\Omega = \sup_{0 \le t \le b^{-1}} |M_n(t)| = \max(\Omega^+, -\Omega^-)$.
Define

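$$
R_1 = \mathsf{P}\Bigl(\max_{1 \le k \le N} \sup_{t \in V_k} M_n(t) > x\Bigr); \qquad
R_2 = \mathsf{P}\Bigl(\min_{1 \le k \le N} \inf_{t \in V_k} M_n(t) < -x\Bigr);
$$

$$
R_3 = \sum_{k=1}^{N} |\mathsf{P}(\Upsilon_k^+ > x) - \mathsf{P}(\Xi_k^+ > x)|; \qquad
R_4 = \sum_{k=1}^{N} |\mathsf{P}(\Upsilon_k^- < -x) - \mathsf{P}(\Xi_k^- < -x)|,
$$

where $x = x_z = d_n + z/(2\log b^{-1})^{1/2}$. To deal with $R_1, \ldots, R_4$, we need the
following Lemma 4.6 which will be proved in Section 4.3.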
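
The blocking scheme at the start of this subsection can be made concrete with a short sketch. The code below builds the alternating big blocks $W_i$ and small blocks $V_i$ and the discretization grid from (4.14). The function names, the use of the inverse bandwidth `inv_b` as an input, and all numerical values are assumptions chosen only for illustration.

```python
import math


def big_small_blocks(inv_b, w, v):
    """Split [0, inv_b] into alternating big blocks W_i and small blocks V_i.

    Uses a_i = (i - 1)(w + v) for i = 1..N with N = floor(inv_b / (w + v)),
    and sets a_{N+1} = inv_b, following the description in the text.
    """
    n_blocks = math.floor(inv_b / (w + v))
    a = [i * (w + v) for i in range(n_blocks)] + [inv_b]
    big = [(a[i], a[i] + w) for i in range(n_blocks)]
    small = [(a[i] + w, a[i + 1]) for i in range(n_blocks)]
    return big, small


def grid_in_block(a_k, w, x, a, alpha):
    """Grid a_k + j*a*x^(-2/alpha), j = 1..chi, with chi = floor(w*x^(2/alpha)/a)."""
    chi = math.floor(w * x ** (2.0 / alpha) / a)
    step = a * x ** (-2.0 / alpha)
    return [a_k + j * step for j in range(1, chi + 1)]


# Hypothetical values: b = 0.01, big blocks of length 1, small blocks of length 0.25.
big, small = big_small_blocks(inv_b=100.0, w=1.0, v=0.25)
points = grid_in_block(a_k=big[0][0], w=1.0, x=3.0, a=0.5, alpha=2.0)
print(len(big), big[0], small[0], len(points))
```

The grid spacing $ax^{-2/\alpha}$ shrinks as the threshold $x$ grows, so the number of grid points per big block increases with $x$ while the block itself stays fixed.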

Let $(\alpha, C_0) = (1, K_1)$ if $K_1 > 0$ and $(\alpha, C_0) = (2, K_2)$ if $K_1 = 0$. Let $H_\alpha(a)$
and $H_\alpha$ be the Pickands constants [see Theorem A1 and Lemmas A1 and
A3 in Bickel and Rosenblatt (1973)]. Note that $H_1 = 1$ and $H_2 = 1/\sqrt{\pi}$.

Lemma 4.6. Let $t > 0$ be such that $\inf\{s^{-\alpha}(1 - r(s)) : 0 \le s \le t\} > 0$,
where $r(s)$ is defined in Lemma 4.8. Let $\psi(x) = e^{-x^2/2}/(x\sqrt{2\pi})$. Under conditions
of Theorems 2.1 or 2.2, we have for $a > 0$,

$$
\mathsf{P}\biggl(\bigcup_{j=1}^{\lfloor tx^{2/\alpha}/a \rfloor} \{M_n(v + jax^{-2/\alpha}) > x\}\biggr)
= x^{2/\alpha}\psi(x)\frac{H_\alpha(a)}{a} C_0^{1/\alpha} t + o(x^{2/\alpha}\psi(x))
\tag{4.15}
$$

uniformly over $0 \le v \le b^{-1}$. The limit version of (4.15) with $a \to 0$ also
holds:

$$
\mathsf{P}\biggl(\bigcup_{0 \le s \le t} \{M_n(v + s) > x\}\biggr)
= x^{2/\alpha}\psi(x) H_\alpha C_0^{1/\alpha} t + o(x^{2/\alpha}\psi(x)).
\tag{4.16}
$$

The left tail versions of (4.15) and (4.16) also hold with "$> x$" replaced by
"$< -x$."

By Lemma 4.6, elementary calculations show that, for $x = x_z$,

$$
\operatorname{LIM} R_j := \lim_{v \to 0} \limsup_{a \to 0} \limsup_{n \to \infty} R_j = 0, \qquad j = 1, \ldots, 4.
\tag{4.17}
$$
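
The function $\psi(x) = e^{-x^2/2}/(x\sqrt{2\pi})$ from Lemma 4.6 is the classical leading-order approximation to the standard normal upper tail, which is why it governs the elementary calculations above. The sketch below compares the two numerically; the helper names are illustrative and not part of the paper.

```python
import math


def psi(x):
    """psi(x) = exp(-x^2/2) / (x * sqrt(2*pi)), as in Lemma 4.6."""
    return math.exp(-0.5 * x * x) / (x * math.sqrt(2.0 * math.pi))


def normal_upper_tail(x):
    """P(N > x) for a standard normal N, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


# The ratio tail/psi stays below 1 and approaches 1 as x grows.
for x in (2.0, 3.0, 5.0, 8.0):
    print(x, normal_upper_tail(x) / psi(x))
```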

Note that $\Omega^+ = \max_{1 \le k \le N} \sup_{t \in W_k \cup V_k} M_n(t)$. By a similar identity for $\Omega^-$,
we have

$$
|\mathsf{P}(\Omega > x) - \mathsf{P}(\{\Psi^+ > x\} \cup \{\Psi^- < -x\})| \le R_1 + R_2,
$$

which implies $\operatorname{LIM}|\mathsf{P}(\Omega > x) - h(x)| = 0$ for

$$
h(x) = \mathsf{P}\biggl(\bigcup_{k=1}^{N} \{\Xi_k^+ > x\} \cup \bigcup_{k=1}^{N} \{\Xi_k^- < -x\}\biggr)
\tag{4.18}
$$

in view of $|\mathsf{P}(\{\Psi^+ > x\} \cup \{\Psi^- < -x\}) - h(x)| \le R_3 + R_4$. So (4.4) follows
from Lemma 4.7 below which will be proved in Section 4.4.

Lemma 4.7. Recall (4.17) for the definition of the triple limit LIM. Under
conditions of Theorems 2.1 or 2.2, we have $\operatorname{LIM}|h(x_z) - (1 - e^{-2e^{-z}})| = 0$
for all $z \in \mathbb{R}$.
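
Lemma 4.7 identifies $1 - e^{-2e^{-z}}$ as the limiting exceedance probability at level $x_z$, so the complementary probability converges to $e^{-2e^{-z}}$. For a simultaneous confidence band one needs the value of $z$ at which this limit equals a target coverage. The sketch below inverts the limit in closed form; the function names are hypothetical, and the centering $d_n$ (defined earlier in the paper) is not reproduced here.

```python
import math


def limit_cdf(z):
    """Limiting value exp(-2*exp(-z)) of the non-exceedance probability."""
    return math.exp(-2.0 * math.exp(-z))


def critical_z(level):
    """Solve exp(-2*exp(-z)) = level for z, where 0 < level < 1."""
    return -math.log(-0.5 * math.log(level))


z95 = critical_z(0.95)
print(z95, 1.0 - limit_cdf(z95))
```

Given $z$ from `critical_z`, the threshold would then be $x_z = d_n + z/(2\log b^{-1})^{1/2}$ as in the text.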

4.3. Proof of Lemma 4.6. We need the following lemma.

Lemma 4.8 [Theorems B1 and B2 in Bickel and Rosenblatt (1973)]. Under
condition (C4), for $r(s) = \int K(x)K(x+s)\,dx/\lambda_K$, we have as $s \to 0$ that

$$
r(s) = 1 - \frac{\int [K(x) - K(x+s)]^2\,dx}{2\lambda_K} = 1 - C_0|s|^{\alpha} + o(|s|^{\alpha}).
$$
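
The expansion in Lemma 4.8 can be checked numerically for specific kernels. The sketch below approximates $1 - r(s) = \int [K(x) - K(x+s)]^2\,dx/(2\lambda_K)$ with $\lambda_K = \int K^2(x)\,dx$ by a Riemann sum. The two kernels are illustrative choices (whether they satisfy condition (C4) is not checked here), and the function names are hypothetical. Direct calculation gives $1 - r(s) = |s|/2$ for the uniform kernel on $[-1,1]$, corresponding to $\alpha = 1$, and $1 - r(s) = 1.25s^2 + O(|s|^3)$ for the Epanechnikov kernel, corresponding to $\alpha = 2$.

```python
def uniform_kernel(x):
    """Uniform density on [-1, 1]; discontinuous at the endpoints."""
    return 0.5 if abs(x) <= 1.0 else 0.0


def epanechnikov_kernel(x):
    """Epanechnikov kernel; continuous and vanishing at the endpoints."""
    return 0.75 * (1.0 - x * x) if abs(x) <= 1.0 else 0.0


def one_minus_r(kernel, s, half_width=3.0, num=60001):
    """Riemann-sum approximation of int (K(x) - K(x+s))^2 dx / (2 * int K^2 dx)."""
    dx = 2.0 * half_width / (num - 1)
    grid = [-half_width + i * dx for i in range(num)]
    lam = sum(kernel(x) ** 2 for x in grid) * dx
    diff_sq = sum((kernel(x) - kernel(x + s)) ** 2 for x in grid) * dx
    return diff_sq / (2.0 * lam)


s = 0.05
print(one_minus_r(uniform_kernel, s) / s)            # ratio (1 - r(s)) / |s|
print(one_minus_r(epanechnikov_kernel, s) / s ** 2)  # ratio (1 - r(s)) / s^2
```

The discontinuous kernel produces linear decay of $r$ near zero and the smooth one quadratic decay, which is the dichotomy encoded by the exponent $\alpha$.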

Now we prove Lemma 4.6. Assume $C_0 = 1$. The general case follows from
a simple scale transform. Let $s_j = j/(\log n)^6$, $1 \le j \le t_n$, where $t_n = 1 + \lfloor (\log n)^6 t \rfloor$,
$s_{t_n} = t$. Write $[s_{j-1}, s_j] = \bigcup_{k=1}^{q_n} [s_{j,k-1}, s_{j,k}]$, where $q_n = \lfloor (s_j - s_{j-1})n^2 \rfloor = \lfloor n^2/(\log n)^6 \rfloor$
and $s_{j,k} - s_{j,k-1} = (s_j - s_{j-1})/q_n$. Define $\Gamma_j(s) = M_n(v+s) - M_n(v + s_{j-1})$.
Using the arguments in (4.8) and (4.9), we have

As := K^ Sj J7 s <J T ^ - Tj{Sj ' k - l)l > - 

Let M = 2\/n&(logn)~ 4 . By truncation and Bernstein's inequality, 
A 2 := qnmaxPdTjisj^l > (logn)" 2 /2) 



< q n max 



{ — J +eXP l M ) 



exp 



+ q n P\ 



> Vnfe(logn) 2 /A 



-CI" 

i=l 

where = TJ{\Ti\ > \/nfe(logn)" 4 }, 7} = ui(v + s Jjfc ) - uj(t; + Sj_i), and 

< ^ Iflj-lE^Xi/b - v - s j>k ) - K(X 1 /b -v- s,_i)) 2 

< ^ |H,-|C76|a i>fc - Sj-il < C7n6(logn)- 6 . 
Here we applied Lemma 4.8. Since $\tau_1 < 1 - \delta_1$ and $n^{-\delta_1} = O(b)$, for any
$Q > 2$,

(4.19) E\uf | 2 < C(nby Q / 2 {logn) iQ n TliQ+2 ^ 2 b < CVr r «, 
where tq — > oo as Q — > oo. So A2 < Cn~ 2 Q for any Q > 0, and 

Ai:=pf max sup |l\- (s) I > (log n)~ 2 ) = < Cn~ Q 

Vl<i<*n Sj _ 1<S < Sj . / 

for any Q > 0. Then we have the discretization approximation 

$$
\mathsf{P}\Bigl(\sup_{0 \le s \le t} M_n(v+s) > x\Bigr) \le \mathsf{P}\Bigl(\max_{1 \le j \le t_n} M_n(v + s_j) > x - (\log n)^{-2}\Bigr) + A_1.
$$

We now apply the multivariate Gaussian approximation result in Zaitsev
(1987) to handle $M_n(v)$. To this end, we introduce

$$
\tilde M_n(t) = \frac{1}{\sqrt{nb\lambda_K f(bt)}} \sum_{i=1}^{n} \tilde u_i(t),
\quad \text{where } \tilde u_i(t) = u_i'(t) - \mathsf{E}u_i'(t), \quad
u_i'(t) = u_i(t) I\{|u_i(t)| \le \sqrt{nb}\,(\log n)^{-20}\}.
\tag{4.20}
$$

As in (4.19), we have for any large $Q$,

$$
\sup_{t} \max_{1 \le i \le n} \|\tilde u_i(t) - u_i(t)\| \le Cn^{-Q}.
\tag{4.21}
$$

By (4.21) and Theorem 1.1 in Zaitsev (1987), we have for all large Q, 
P( max M n (v + Sj) > x — (log n)~ 

\l<j<t„ 

(4.22) <?( max M n (v + sA >x- (logn)" 2 ) + Cn~ Q 

\l<j<tn J 

< P( max FnOO ><) +C^ 2 expf- C(1 ° 5 g /2 re)18 ) +Cn-Q 

where $x_n' = x - 2(\log n)^{-2}$ and $(Y_n(1), \ldots, Y_n(t_n))$ is a centered Gaussian
random vector with covariance matrix

$$
\Sigma_n = \operatorname{Cov}(\tilde M_n(v + s_1), \ldots, \tilde M_n(v + s_{t_n})).
\tag{4.23}
$$

By Lemma 4.9 below and Lemma A4 in Bickel and Rosenblatt (1973), we 
have 

P( max Y n (j) > x' n ) < p( max Y n ( Sj ) > x' n ) + 
\i<j<i n / Vi<j<t n / exp(x„ /2) 



< P max F n (s ? ) >x')+Cb 
\i<j<t n ■>' 



1+5 
for some $\delta > 0$, where $\tilde Y_n(\cdot)$ is a separable stationary Gaussian process with
mean 0 and covariance function $r(\cdot)$. By Lemma A3 in Bickel and Rosenblatt
(1973) and some elementary calculations,

$$
\mathsf{P}\Bigl(\max_{1 \le j \le t_n} \tilde Y_n(s_j) > x_n'\Bigr) \le \mathsf{P}\Bigl(\sup_{0 \le s \le t} \tilde Y_n(s) > x_n'\Bigr)
= x^{2/\alpha}\psi(x) H_\alpha t + o(x^{2/\alpha}\psi(x)).
$$

This implies the upper bound in (4.16). With the same argument, for any 
a > 0, 

P( sup M n (v + s) >x) 




1+6 



[tx 2 ' a /a] 

- Yj P(x<Y n (jax- 2/a )<x + 2{logn)- 2 )-Cb 1+5 
i=i 

= ^W^i + (x 2 l a Mx)). 
a 

Then the lower bound in (4.16) is obtained by (A20) in Bickel and Rosenblatt
(1973), letting first $n \to \infty$ and then $a \to 0$.

Using a similar and simpler proof, we can prove (4.15). 

Lemma 4.9. For the covariance matrix $\Sigma_n$ defined in (4.23), we have

$$
|\Sigma_n - (r(s_i - s_j))_{1 \le i,j \le t_n}| \le Ct_n^2(b + n^{-\varpi}) \quad \text{for some } \varpi > 0.
\tag{4.24}
$$

Proof. Let E n = Cov(M n (v + si), . . . ,M n (v + s t J). By (4.21), |S n - 
S n | < Cn' Q for any Q > 0. Note that E(i^(t)) < Cn T ~ T1 and n > r. Then 

|Cov(M n (s),M n (t)) - Cov(M ra (s) + R n (s),M n (t) + R n (t))\ < Cn T ^ T ^ 2 . 

By (4.11), we obtain that \\M n (t) + R n (t) - M n (t)\\ 2 < Cn 51 '^. Thus, 

\Cov(M n (s),M n (t)) - Cov(M n (s) + R n (s),M n (t) + R n (t))\ < Cn s ^ 2 ~ T ^ 2 . 
Since $K(x) = 0$ if $|x| > 1$, for $0 \le s, t \le b^{-1}$, we have

$$
\bigl|\mathsf{E}[K(X_k/b - s)K(X_k/b - t)] - b\sqrt{f(bs)f(bt)}\,r(s-t)\lambda_K\bigr| \le Cb^2.
$$

Note that $\mathsf{E}(|K(X_k/b - t)| \mid \xi_{k-1}) \le Cb$. Therefore,

$$
|\operatorname{Cov}(M_n(s), M_n(t)) - r(s-t)| \le Cb.
$$

Combining the above arguments, we prove (4.24). □

4.4. Proof of Lemma 4.7. Let $\tilde M_n(t)$ be defined in (4.20) with 20 therein
replaced by $20d$. Also, $d$ may vary accordingly. Let $x_n^{\pm} = x \pm (\log n)^{-2d}$ and





= {M n (a k +jax 


-2/a 


) >x}U{M n {a k +jax- 2/a ) <- 


-x], 


%j 


= {M n (a k +jax 


-2/a 


)>x n }U{M n (a k +jax- 2 / a )< 






= {Y n (a k + jax~ 




>x}U {Y n {a k +jax- 2 ' a ) <-x}, 


D £i 


= {Y n (a k + jax~ 


2/") 


>x n }U{Y n (a k +jax" 2 ^ a )<- 


•En} j 


»tj 


= {Y n (a k + jax~ 


2/«) 


>x n }U{Y n {a k +jax- 2 ^ a )<- 


%n\ i 



where Y n (-) and Y n (-) are centered Gaussian processes with covariance func- 
tions 

Cov{Y n { Sl ),Y n {s 2 )) = Cov(M n (*i),M„(« 3 )), 
Cov(y n ( Sl ),y n (s 2 )) = Cov(M n (ai),M n (a 2 )), 
respectively. Recall (4.14) for x- Let 

XXX X 

A*=U B *J- C * = U D M- c t = \JK and e * = U 6 i- 

j=i j=i j=i j=i 



Lemma 4.10. Let $N = \lfloor b^{-1}/(w+v) \rfloor$. Under the conditions of Theorems
2.1 or 2.2, we have for any fixed integer $l$ satisfying $1 \le l \le N/2$ that



N 



21-1 



\k=l / d=l K l<ii<-<i d <N 1 7 V=l / 

where $C_1$ does not depend on $l$, and $\mathcal{I}$ is defined in (4.26).
Proof. By Bonferroni's inequality, we have 

A,- | 



21 f d 

eh'-' e p n 

d=l l<ii<-<i d <JV \j=l 



c 



(21)] logn : 
(4.25)

(N \ 21-1 / d \ 

l>* <E(-r £ p • 
k=l / d=l l<h<---<i d <N \j=l / 

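We now estimate the probability $\mathsf{P}(\bigcap_{j=1}^{d} A_{i_j})$. Recall $W_k = [a_k, a_k + w]$. Let
$q_j = i_{j+1} - i_j$, $1 \le j \le d-1$. Define the index set

$$
\mathcal{I} := \Bigl\{1 \le i_1 < \cdots < i_d \le N : \min_{1 \le j \le d-1} q_j \le \lfloor 2w^{-1} + 2 \rfloor\Bigr\}.
\tag{4.26}
$$

Let $0 \le d_0 \le d-2$ and

$$
\mathcal{I}_{d_0} = \{1 \le i_1 < \cdots < i_d \le N : \text{the number of } j \text{ such that } q_j > \lfloor 2w^{-1} + 2 \rfloor \text{ is } d_0\}.
$$

Then we have $\mathcal{I} = \bigcup_{d_0=0}^{d-2} \mathcal{I}_{d_0}$. We can see that the number of elements in
the sum $\sum_{\mathcal{I}_{d_0}} \mathsf{P}(\bigcap_{j=1}^{d} A_{i_j})$ is bounded by $CN^{d_0+1} = O(b^{-d_0-1})$, where $C$ is
independent of $N$. Suppose now $i_1, \ldots, i_d$ are in $\mathcal{I}_{d_0}$. Write

$$
\bigcap_{j=1}^{d} A_{i_j} = \bigcup_{j_1=1}^{\chi} \cdots \bigcup_{j_d=1}^{\chi} (B_{i_1,j_1} \cap \cdots \cap B_{i_d,j_d}).
$$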

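Without loss of generality, we assume $q_1 \le \lfloor 2w^{-1} + 2 \rfloor$, $q_2 > \lfloor 2w^{-1} + 2 \rfloor, \ldots,$
$q_{d_0+1} > \lfloor 2w^{-1} + 2 \rfloor$. By (4.21) and Theorem 1.1 in Zaitsev (1987), we have
for all large $Q$,

$$
\begin{aligned}
\mathsf{P}(B_{i_1,j_1} \cap \cdots \cap B_{i_d,j_d}) &\le \mathsf{P}(\tilde B^-_{i_1,j_1} \cap \cdots \cap \tilde B^-_{i_d,j_d}) + Cn^{-Q} \\
&\le \mathsf{P}(D^-_{i_1,j_1} \cap \cdots \cap D^-_{i_d,j_d}) + C\exp(-(\log b^{-1})^2) + Cn^{-Q}.
\end{aligned}
\tag{4.27}
$$

By (4.21), we have uniformly in $s_1$ and $s_2$ that, for any large $Q$,

$$
|\operatorname{Cov}(Y_n(s_1), Y_n(s_2)) - \operatorname{Cov}(\tilde Y_n(s_1), \tilde Y_n(s_2))| \le Cn^{-Q}.
\tag{4.28}
$$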

Using the argument of (4.24), there exist $C > 0$ and $\varpi > 0$, such that for
$\nu_n = C(b + n^{-\varpi})$ and any $1 \le j_k \le \chi$, we have

$$
|\operatorname{Cov}(Y_n(a_{i_l} + j_l ax^{-2/\alpha}), Y_n(a_{i_k} + j_k ax^{-2/\alpha}))| \le \nu_n \quad \text{for } 3 \le k \le d_0 + 1,\ l = 1, 2;
$$

$$
|\operatorname{Cov}(Y_n(a_{i_s} + j_s ax^{-2/\alpha}), Y_n(a_{i_k} + j_k ax^{-2/\alpha}))| \le \nu_n \quad \text{for } 3 \le k \ne s \le d_0 + 1;
$$

$$
|\operatorname{Var}(Y_n(a_{i_k} + j_k ax^{-2/\alpha})) - 1| \le \nu_n \quad \text{for } 1 \le k \le d_0 + 1;
$$

and, letting $\mu = r(a_{i_2} - a_{i_1} + (j_2 - j_1)ax^{-2/\alpha})$,

$$
|\operatorname{Cov}(Y_n(a_{i_1} + j_1 ax^{-2/\alpha}), Y_n(a_{i_2} + j_2 ax^{-2/\alpha})) - \mu| \le \nu_n.
$$

Note that $|j_2 - j_1|ax^{-2/\alpha} \le w$ and $a_{i_2} - a_{i_1} \ge w + v$ and $\sup_{x \ge v} |r(x)| < 1$.
Let any $1 \le j_k \le \chi$ and $V_n$ be the covariance matrix of the Gaussian vector
$(Y_1, \ldots, Y_{d_0+1})$, where $Y_k = Y_n(a_{i_k} + j_k ax^{-2/\alpha})$, $1 \le k \le d_0 + 1$. Using the bounds
of the covariances above, we have for some $\delta > 0$ that

$$
|V_n - V| \le Cn^{-\delta} \quad \text{where } V = \begin{pmatrix} V_1 & 0 \\ 0 & I_{d_0-1} \end{pmatrix} \text{ and } V_1 = \begin{pmatrix} 1 & \mu \\ \mu & 1 \end{pmatrix}.
\tag{4.29}
$$

By (4.29), we have

$$
|V_n^{-1} - V^{-1}| \le Cn^{-\delta} \quad \text{and} \quad |\sqrt{\det(V)} - \sqrt{\det(V_n)}| \le Cn^{-\delta}.
\tag{4.30}
$$

Let $p_n(y)$ be the density of $(Y_1, \ldots, Y_{d_0+1})$, and $p(y)$ be the density of the
Gaussian random vector with covariance matrix $V$. By (4.30), we have

$$
\begin{aligned}
|p_n(y) - p(y)| &\le Cn^{-\delta}p(y) + C\exp(-yV^{-1}y'/2)\,|\exp(Cn^{-\delta}|y|^2) - 1| \\
&\le C(n^{-\delta} + n^{-\delta}(\log n)^2)p(y) + C\exp(-(\log n)^2/C).
\end{aligned}
\tag{4.31}
$$

Hereafter, $\delta > 0$ may be different in different places. Note that

$$
|\mu| \le \sup_{x \ge v} |r(x)| < 1.
$$

Then it follows from Lemma 2 in Berman (1962) that, for some 5 > 0, we 
have 

(4.32) <(1 + Cn~ 5 )/ p(y)dy + Cexp(-(logn) 2 /C) 

< Cb do+1+s , 
where y = (yi, . . . , yd 0+ i) and 

do+l 

s± = n k% - x ^ u - " x «^- 

Noting that x d = ©(ft- 5 / 2 ) and by (4.27) and (4.32), we have for some 5 > 0, 

(4.33) EE p (rW)^- 

d =0 l dQ \j=l / 

We now estimate 

(«*> ( £ -EWrWY 

V l<u<--<i d <jV 1 7 \j=l / 
Suppose that $i_1, \ldots, i_d \notin \mathcal{I}$. Since $i_{j+1} - i_j > \lfloor 2/w + 2 \rfloor$, we have $a_{i_{j+1}} - a_{i_j} >$
$(w+v)\lfloor 2/w + 2 \rfloor > 2 + w + v$. Then, for $1 \le s \ne k \le d$, $1 \le j_s, j_k \le \chi$,

$$
|\operatorname{Cov}(Y_n(a_{i_s} + j_s ax^{-2/\alpha}), Y_n(a_{i_k} + j_k ax^{-2/\alpha}))| \le C(b + n^{-\varpi})
$$

holds for some $\varpi > 0$. By the bounds of the covariances above, the covariance
matrix $V_n$ of $(Y_1, \ldots, Y_d)$ when $i_1, \ldots, i_d \notin \mathcal{I}$ satisfies

$$
|V_n - I_d| \le Cn^{-\delta} \quad \text{for some } \delta > 0.
\tag{4.35}
$$



For the probability in the sum in (4.34), as in (4.27) and (4.32), we have for 
n large, 



p n A h )< e • • • e p ( B ^ a n • • • n 

<E---E p ( fi ^ 1 n---nD-. d ) + ^ 

ji=i id=i 
x x 

- 2 ^ E " " " E ( x ~ 1 exp(-x 2 /2)) d + Cb d+S + Cn" Q 
ii=i id=i 

< 2 d (xrE" 1 exp(-x 2 /2)) a! + Cb 1+S < Cfb d + C& d+<5 

for some $C_1 > 0$ which does not depend on $d$. This together with (4.33)
implies that

$$
\sum_{1 \le i_1 < \cdots < i_d \le N} \mathsf{P}\Bigl(\bigcap_{j=1}^{d} A_{i_j}\Bigr) \le C_1^d/d! + Cb^{\delta}
\tag{4.36}
$$

for some $C_1 > 0$ which does not depend on $d$. To prove Lemma 4.10, by
(4.25), (4.33) and (4.36), we only need to show that, for $i_1, \ldots, i_d \notin \mathcal{I}$,



(4.37) 



p (iM- p Cq' 



<Cb d (log n) 



By (4.21) and Theorem 1.1 in Zaitsev (1987), as in (4.22), it suffices to show 

<Cb d {\ogn)- d . 



By (4.28) and Lemma A4 in Bickel and Rosenblatt (1973), using P(D^=i z 
1 — P((Jj=i C^ c ) and the inclusion-exclusion principle, we have for any large 
Q,
<C X 2 n~ 2Q <Cn~ Q .



So it suffices to show that 
(4.38) 



p(nc,)-p(nc,t) 



<Cb d (\ogn)- d . 



\j=l / \j= 

By (4.35) and a similar inequality as (4.31), we have, for some $\delta > 0$,

$$
|\mathsf{P}(D^{\pm}_{i_1,j_1} \cap \cdots \cap D^{\pm}_{i_d,j_d}) - (\mathsf{P}(D^{\pm}))^d| \le Cb^{d+\delta},
$$

where $D^{\pm} = \{N > x_n^{\pm}\} \cup \{N < -x_n^{\pm}\}$ and $N$ is a standard normal random
variable. It follows that, for some $\delta > 0$,



X X 



< E • • • E n • • • n - p (Pin n • • • n D 

ii=i id=i 



*d ,Jd J 



X X 



= E ■ • ■ E i( p ( D ~)) d - ( p ( D+ )) d i + Chd+& - 

ii=i id=i 

So (4.38) follows from $\mathsf{P}(D^-) - \mathsf{P}(D^+) \le C(\log n)^{-2d}b$ and $\mathsf{P}(D^{\pm}) \le Cb/(\log b^{-1})^{1/\alpha}$.
The lemma is then proved. □

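We are ready to prove Lemma 4.7. Let $\{\varepsilon_i^{(k)}\}_{i \in \mathbb{Z}}$, $1 \le k \le n$, be i.i.d.
copies of $\{\varepsilon_i\}_{i \in \mathbb{Z}}$, and $\xi_k^{(k)} = (\ldots, \varepsilon_{k-1}^{(k)}, \varepsilon_k^{(k)})$, $X_k^{(k)} = G(\xi_k^{(k)})$. Then $X_k^{(k)}$,
$1 \le k \le n$, are i.i.d. Now define $A_k'$, $M_n'(t)$, $\tilde M_n'(t)$, $N_n'(t)$, $R_n'(t)$, $R_1', \ldots, R_4'$
by replacing $\{X_k\}$ and $\{\varepsilon_i\}$ by $\{X_k^{(k)}\}$ and $\{\varepsilon_i^{(k)}\}$, respectively, in the above
proofs. Repeating the arguments above, we can obtain that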



\k=l / d=l x l<ii<-<i d <N 1 7 \j=l 

By letting n — > oo and then I — > oo , we have 



C? 0(1) 
- (2Z)! logn' 



lim sup 



\fe=i / \fc=i 



0. 
Similarly, (4.17) holds with $R_j$ therein replaced by $R_j'$. Hence, as $n \to \infty$,



(4.39) 



LIM 



p(\J A' fc )-P( sup \M' n {t)\<x 

\k=l J W&- 1 



0. 



Note that Lemmas 4.1-4.3 also hold for $(X_k^{(k)})_{k \in \mathbb{Z}}$, $M_n'(t)$, $\tilde M_n'(t)$, $N_n'(t)$,
$R_n'(t)$. By the theorem in Rosenblatt (1976), the second probability in (4.39)
converges to $e^{-2e^{-z}}$. This completes the proof.

5. Proofs of Proposition 2.1, Theorems 2.4 and 2.5. Without loss of
generality, we assume $l = 0$, $u = 1$. We first introduce the truncation

Z k = Z k I{\Z k \ < (logn) 12 /^ 2 )} - E(Z k I{\Z k \ < (logn) 12 ^- 2 )}), 

Z k = Z k I{\Z k \ > v^Alogn) 4 } - E(Z k I{\Z k \ > ^/(logn) 4 }) 

and Z k = Z k — Z k , 1 < k < n. Correspondingly, define 

r n (x) = — ^= \\k( ^ -x ]Z k =: — jL= V" w n>k {x), 

r n ,i(x) = — jL=V].K"( ^ -x )Z k =: — ^= Y] w ntk i(x), 

1 n 

r n ,2(x) = r n {x) - r n> i(x) =: - 1 =y^ j w n ^ k2 {x). 

• 'no 



k=l 



Lemma 5.1. Under the conditions of Proposition 2.1, we have

$$
\mathsf{P}\Bigl(\sup_{0 \le x \le b^{-1}} |r_n(x)| > 3(\log n)^{-2}\Bigr) = o(1).
$$



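Proof. Since $b \ge Cn^{-\delta_1}$ and $\mathsf{E}|Z_1|^p < \infty$, $p > 2/(1 - \delta_1)$, for $n$ large, we
have

$$
\mathsf{E}\sup_{0 \le x \le b^{-1}} |r_{n,1}(x)| \le Cn(nb)^{-p/2}(\log n)^{4p-4}
\le Cn^{1 - p(1-\delta_1)/2}(\log n)^{4p-4} \le (\log n)^{-3}.
\tag{5.1}
$$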



We now deal with $r_{n,2}$. Let $q_n = \lfloor n^2/b \rfloor$, $t_j = j/(bq_n)$, $j = 0, \ldots, q_n$. As in
(4.8), we have

(5.2) max sup |r B , a (f) - r n , 2 (t,)| < — ^ + C ™*^" L i . 
a<j<qn tj <t<t j+1 n(logn) 4 (logn) 4 
By (4.9), (5.1), (5.2) and since $r_{n,2}(x) + r_{n,1}(x) = r_n(x)$, it suffices to show

$$
\mathsf{P}\Bigl(\max_{0 \le j \le q_n} |r_{n,2}(t_j)| > 2(\log n)^{-2}\Bigr) = o(1).
\tag{5.3}
$$



Note that E{Zl) < C(logn)- 12 . By (C3) [or (C3)'], we have 



(5.4) 



max E [^n,fc2(^)l6-2] < Cn6(log 



n) 



0<j<q n 



k=l 



Thus, (5.3) follows from (5.4) and applying Freedman's inequality to martingale
differences $\{w_{n,k2}(x), k = 1, 3, \ldots\}$ and $\{w_{n,k2}(x), k = 2, 4, \ldots\}$. □

Proof of Proposition 2.1. Let $m = \lfloor n^{\tau} \rfloor$, where $\delta_1/\gamma < \tau < 1 - \delta_1$,
and

X k 



Z k {t) = Z k {K[-^-t 



k—m,k 



Kk<n. 



Note that $\{Z_1(t), Z_3(t), \ldots\}$ and $\{Z_2(t), Z_4(t), \ldots\}$ are two sequences of martingale
differences. As in the proof of Lemma 4.2, we can show that

J2 Z M-l(t) > v^logn)- 2 j =o(l), 



sup 

,0<t<6" 1 



(5.5) 



Set 



k=l 



sup 

,0<t<b" 1 



n/2 

I>2 fc (t) 
k=l 



> Vnb(logn) 2 

> Vnb(logn)~ 2 



ol). 



N n (t) 



y/nb\ K f(bt) 



m,k— 1 



z k . 



Since $\sup_t \mathsf{E}(\{Z_k \mathsf{E}[K(X_k/b - t) \mid \xi_{k-m,k-1}]\}^2 \mid \xi_{k-1}) \le Cb^2$, we have by Freedman's
inequality for martingale differences,



P( max \N n {tj)\ > (logn)" 

\0<j<q„ W/ ' 



which, together with the discretization approximation as in (4.8), yields that 



(5.6) 



P sup \N n (t)\ >2(logn)~ 

^0<i<b- 1 



0(1). 



Set ex 2 = EZ 2 and 
Following the argument of Lemma 4.5 and replacing the truncation levels
$(\log n)^{-20}$ and $(\log n)^{-20d}$ in (4.20) and the proof of Lemma 4.7 with
$(\log n)^{-20p/(p-2)}$ and $(\log n)^{-20pd/(p-2)}$, respectively, we can get

$$
\mathsf{P}\Bigl((2\log b^{-1})^{1/2}\Bigl(\sup_{0 \le t \le b^{-1}} |M_n(t)| - d_n\Bigr) \le z\Bigr) \to e^{-2e^{-z}}.
\tag{5.7}
$$

Note that $|1 - \tilde\sigma^2/\sigma^2| = O((\log n)^{-12})$. The proposition follows from Lemma
5.1 and (5.5)-(5.7). □

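Proof of Theorem 2.4. Write $(\mu_n(x) - \mu(x))f_n(x) = R_n^r(x) + M_n^r(x)$,
where

$$
R_n^r(x) = \frac{1}{nb}\sum_{k=1}^{n} K\Bigl(\frac{X_k - x}{b}\Bigr)(\mu(X_k) - \mu(x)), \qquad
M_n^r(x) = \frac{1}{nb}\sum_{k=1}^{n} K\Bigl(\frac{X_k - x}{b}\Bigr)\sigma(X_k)\eta_k.
$$

Then Theorem 2.4 follows from Lemmas 4.4, 5.2 and 5.3 and Proposition
2.1. □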
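
The decomposition of $(\mu_n(x) - \mu(x))f_n(x)$ into a bias-type term and a noise-type term is an algebraic identity once $Y_k = \mu(X_k) + \sigma(X_k)\eta_k$ is substituted into the Nadaraya-Watson estimator. The following sketch checks it on simulated data from a nonlinear autoregression of the form (1.3). The drift, volatility, kernel, bandwidth and sample size are all hypothetical choices made only for illustration, and `R` and `M` denote the two terms $R_n^r(x)$ and $M_n^r(x)$.

```python
import math
import random


def epanechnikov(u):
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def mu(y):
    """Hypothetical regression (drift) function."""
    return 0.5 * y


def sigma(y):
    """Hypothetical conditional standard deviation."""
    return math.sqrt(0.5 + 0.2 * y * y)


def simulate(n, seed=0):
    """Simulate Y_i = mu(Y_{i-1}) + sigma(Y_{i-1}) * eta_i and return (X, Y, eta)."""
    rng = random.Random(seed)
    eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    path = [0.0]
    for e in eta:
        path.append(mu(path[-1]) + sigma(path[-1]) * e)
    return path[:-1], path[1:], eta


def decomposition(x, X, Y, eta, b):
    """Return (mu_n(x), f_n(x), R, M) for kernel bandwidth b."""
    n = len(X)
    k = [epanechnikov((xk - x) / b) for xk in X]
    f_n = sum(k) / (n * b)
    mu_n = sum(ki * yi for ki, yi in zip(k, Y)) / (n * b * f_n)
    R = sum(ki * (mu(xk) - mu(x)) for ki, xk in zip(k, X)) / (n * b)
    M = sum(ki * sigma(xk) * e for ki, xk, e in zip(k, X, eta)) / (n * b)
    return mu_n, f_n, R, M


X, Y, eta = simulate(2000)
mu_n, f_n, R, M = decomposition(0.3, X, Y, eta, b=0.2)
print((mu_n - mu(0.3)) * f_n, R + M)
```

The two printed numbers agree up to floating point error, since the identity holds exactly for any sample.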



Lemma 5.2. Under the conditions of Theorem 2.4, we have 



sup \R r n {x) - b %l) K Pfi{x)\ = Op(r„) where r n 

0<x<l 



n 



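Proof. Set $\gamma_k(x) = K((X_k - x)/b)(\mu(X_k) - \mu(x))$. Let $q_n = \lfloor n^2/b \rfloor$, $t_j =$
$j/q_n$, $j = 0, \ldots, q_n$. Since $\mu(\cdot) \in C^4(T^{\epsilon})$, $\max_{0 \le j \le q_n} \mathsf{E}[\gamma_k^2(t_j) \mid \xi_{k-1}] \le Cb^3$. By
Freedman's inequality for martingale differences, we have

$$
\max_{0 \le j \le q_n} \Bigl|\sum_{k=1}^{n} (\gamma_k(t_j) - \mathsf{E}[\gamma_k(t_j) \mid \xi_{k-1}])\Bigr| = O_{\mathsf{P}}\bigl(\sqrt{nb^3\log n}\bigr),
$$

where we used the condition $0 < \delta_1 < 1/3$. Recall that $K(x)$ and $m(x)$ are
Lipschitz continuous in $[-1, 1]$. Using the discretization approximation as in
(4.8) and the argument in (4.9), it can be seen that

$$
\sup_{0 \le x \le 1} \Bigl|\sum_{k=1}^{n} (\gamma_k(x) - \mathsf{E}[\gamma_k(x) \mid \xi_{k-1}])\Bigr| = O_{\mathsf{P}}\bigl(\sqrt{nb^3\log n}\bigr).
$$

The rest of the proof is the same as that of Lemma 2(ii) in Zhao and Wu
(2008). □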
Lemma 5.3. Under the conditions of Theorem 2.4, we have

$$
\sup_{0 \le x \le 1} \biggl|\frac{1}{nb}\sum_{k=1}^{n} K\Bigl(\frac{X_k - x}{b}\Bigr)(\sigma(X_k) - \sigma(x))\eta_k\biggr| = O_{\mathsf{P}}\biggl(\sqrt{\frac{b\log n}{n}}\biggr).
$$

Proof. Let 



Vk = Vk I {\'nk\ > Vn6/(logn) 4 } - E(rj k I{\r] k \ > Vn6/(logn) 4 }), 
a(X k ) - a(x))r} k , 



w nk (x) = K\ 



w nk (x) = K 



V b 

X k -x 



(a(X k ) - cr(x))r}k, % = i] k - rj k . 



Note that sup x€T ,\K((X k - x)/b)(a(X k ) - a(x))\ <Cb. Then 



Esup 



Since sup xeR E[fi5^ fc (x)|^ fc _ 2 ] < Cb 3 , we have 



supVE[^ fc ( :E )|£ fc _ 2 ]<Cn& 3 . 
* eR fc=i 

Using the arguments for (5.2) and (5.3), we can show that 

1 n 



sup 

0<x<l 



fc=l 



Of 



fologn 



The lemma is proved. □ 



Proof of Theorem 2.5. Write 

n 



k=l 



Xu — x 



h 



(5.8) 



+ 



+ 



k=i v 7 



nhf nl (x) ^ 

>A / X k - x 



nhf n i(x) \ h 

: o- 2 nl {x) + c n2 {x) + a 2 n z{x). 



\p(X k ) - fi n (X k )]o-(X k )rik 
[fi(X k ) - /i„(X fc )] 2 
We have



sup o- n3 x =0 P — — + 6 
o<x<i V nb 



(5.9) 



1 n 

0<x<l ra/t f-f 
— — fe=l 



A' 



JX k -x 



h 



logn +6 4,. 
nb 



Using a similar argument as in Zhao and Wu [(2008), page 1875] we have 
(5.10) 

For al x (x), 



sup |c„2(a;)| =Op[ . ,. 



{a nl {x)-a (x))f nl (x) 

'Xf; — x 



i 



nh 



fc=i 



(5.11) 



n 

n/i ^ 



fc=l 



h 

X k — x 
h 



<y\x)(r,l-i) 

(a 2 (X k )-a\x))(4-l) 



1 n 



nh ' \ h 
k=i 



Xu — x 



{a\X k )-a\x)) 



=:M r n2 (x)+R r n2 (x)+R r n3 (x) 
As in the proof of Lemma 5.3, we get 



(5.12) 



SU P \ R n2( x )\ =Op\ 
0<x<l 



blogn 



n 



Also, for R r n2 (x), we have similarly as in Lemma 5.2 that 



(5.13) 



sup \R r n2 {x) - h ip K p a (x)\ = Op(T n ). 

0<x<l 



Theorem 2.5 now follows from Lemma 4.4, Proposition 2.1 and (5.8)-(5.13). 
□